Developing Data Attribution and Citation Practices and Standards: An International Symposium and Workshop, August 22 - 23, 2011, Hotel Shattuck Plaza

Presentation transcript:

Developing Data Attribution and Citation Practices and Standards: An International Symposium and Workshop
August 22 - 23, 2011, Hotel Shattuck Plaza
Data Citation Mechanism and Service for Scientific Data: Defining Framework for Biodiversity Data Publishers
Vishwas Chavan, Global Biodiversity Information Facility (GBIF) Secretariat
August 22, 2011

GBIF, facilitating free and open access to biodiversity data Last updated: million Data records Data resources 342 Data Publishers

Anticipated impact of Data Citation: Data Discovery, Data Publishing, Data Preservation, Data Use

Current Practice of Data Citation
User search:
  Source: GBIF Data Portal, data.gbif.net
  Search string: Panthera tigris
  Search results: 696 records, from 37 datasets, published by 31 Data Publishers
  Date: Thursday, 4 November 2010, Time:
Existing Data Citation style:
  Please cite this data as follows:
  (accessed through GBIF data portal, Mammal specimens,
  (accessed through GBIF data portal, Vertebrate specimens,
  (accessed through GBIF data portal, Natural History Museum Rotterdam,
  (accessed through GBIF data portal, Database Schema for UC Davis Wildlife museum,
  (accessed through GBIF data portal, UNSM Vertebrate Specimens,
Unanswered questions:
  - What was the search string?
  - How many records were retrieved?
  - How many Data Publishers contributed to the data?
  - When was the search carried out?
  - Who is the original contributor of the data?
  - Who played what role from collection to publishing?
  - How can I retrieve the same result?

What is needed?
- Data citation mechanism
  - Data citation style
  - Recognise ALL with their roles (producer, publisher, aggregator, curator)
  - Cascading citations: citations within citations
- Data Citation Service
  - Citation Registry service
  - Citation Resolution service
  - Discovery service
  - Resolve to full text citation
  - Link to underlying data

Solution under consideration

Style for Publisher determined citations: Key considerations
- Types of Publishers
  - Publisher (individual)
  - Publisher (group of individuals)
  - Institution or Research Group or Consortium
- Recognizing the role played by individuals
  - Collector, Curator, Administrator etc.
- Release / Update frequency
  - One-time release
  - Frequent updates
- Primary URI to access dataset
- Persistent Identifier of the dataset and/or citation
- Dates of first release and latest update
- Number of data records

Data Citation style for Publishers: Publisher (individual), one-time data release
Template: Publisher (YEAR),,, published,, released on,.
Example: Rumble KJ (1998). Cephalopods of North America records, published online, released on 31/12/1998, doi: /iisc

Data Citation style for Publishers: Publisher (group of individuals), frequent updates
Template: Publisher 1,..... and Publisher n,, published,, first released on,,.
Example: Remsen D, Bello J, Sheldon S, Raymond M, and AJK Arino (2005 -). Fishes of the Cape Cod Region, MA,USA records published online, first released on 17/05/2005, last updated on 10/10/2010, doi: /mbl

Data Citation style for Publishers: Institute / Research Group / Consortium, frequent updates
Template: ,,,,,,,.
Example: Smithsonian National Museum of Natural History (2002 -), Museum Collection Records: Mammals records. Contributed by Helgen KM (Principal Investigator, curator, author), Gordon LK (manager, author, curator), Peurach SC (author, manager), Potter CW (manager, author), Carleton MD (curator), Maldonado JE (author, developer), Wilson DE (curator, author), Thorington Jr RW (curator, author, validator), Ludwig CA (manager, developer, author), Lunde DP (author). Published online, first released on 12/02/2002, last updated on 15/09/2010, doi: /smi
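The two publisher styles above (one-time release and frequent updates) can be sketched as a small formatter. This is only an illustration under stated assumptions: the field names, the `DatasetCitation` class, and the sample values (publisher names, record count, identifier) are hypothetical and do not come from the slides, whose templates are incomplete in this transcript.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DatasetCitation:
    """Hypothetical container for the elements listed under 'Key considerations'."""
    publishers: Sequence[str]            # individual or group of individuals
    title: str
    record_count: int
    first_released: str                  # dd/mm/yyyy, as in the slide examples
    identifier: str                      # persistent identifier (DOI, LSID, ARK, ...)
    last_updated: Optional[str] = None   # None means a one-time release
    medium: str = "online"


def join_names(names: Sequence[str]) -> str:
    """Join publisher names as 'A', 'A and B', or 'A, B, and C'."""
    names = list(names)
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        return f"{names[0]} and {names[1]}"
    return ", ".join(names[:-1]) + ", and " + names[-1]


def format_citation(c: DatasetCitation) -> str:
    """Render a citation string in the spirit of the slide examples."""
    year = c.first_released[-4:]
    if c.last_updated is None:
        # One-time release: plain year, single release date.
        year_part = f"({year})"
        dates = f"released on {c.first_released}"
    else:
        # Frequent updates: open-ended year, first release and last update.
        year_part = f"({year} -)"
        dates = (f"first released on {c.first_released}, "
                 f"last updated on {c.last_updated}")
    return (f"{join_names(c.publishers)} {year_part}. {c.title}, "
            f"{c.record_count} records, published {c.medium}, "
            f"{dates}, {c.identifier}")


# Hypothetical sample values, not taken from the slides.
one_time = DatasetCitation(["Example A"], "Hypothetical dataset", 100,
                           "31/12/1998", "doi:10.0000/example")
print(format_citation(one_time))
```

Keeping `last_updated` optional lets a single function cover both release patterns; an institutional style with contributor roles would need an extra field.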

User driven Citations: Key considerations
- Citations within citations
  - Three types of citations
  - User driven citations
  - Publisher determined citations
- Use of Persistent Identifiers (PI)
  - Persistent Identifiers for each citation type
  - Support multiple types of PIs: Handles, ARK, PURL, URN, LSID, DOI etc.

Data Citation in future: hypothetical exemplification
User search:
  Source: GBIF Data Portal, data.gbif.net
  Search string: Panthera tigris
  Search results: 696 records, from 37 datasets, published by 31 Data Publishers
  Date: Thursday, 4 November 2010, Time:
User driven citation using Persistent Identifier:
  (2010). user doi: /gbif
The Persistent Identifier resolves to the full composite citation, and/or a snapshot of the resultant data can be accessed.

User driven citation:
  (2010). user doi: /gbif
Full text composite citation:
  (2010). Search string: Panthera tigris, 696 records, contributed by 37 data resources, user doi: /gbif , accessed on 04/11/2010, 10:03:30.
  1. Louisiana State University (2007), Museum of Natural Science: Collection of Mammal, records. Contributed by Patterson DN (Principal Investigator, architect, author), Sandeep PK (author, curator), Fieldman LN (author, developer), Remsen D (curator, validator), published online released on October 2007, doi: /lsu
  2. Michigan State University (2001 -), MSU Vertebrate Collection, records. Contributed by Cook DK (Principal Investigator, author, curator, validator), Hirsh L (author, architect, developer), Lane MP (manager, author, curator), Morris JH (curator), published online first released on 01/10/2001, last updated on 18/01/2010, urn:lsid:msu.org:observation:
  3. Cursada PK, Bello J, and AJK Moelicker (2006), Natural History Museum Rotterdam: Mammal collection, 1123 records, published online, released on 7 July 2006,
  4. Rumble KJ (1998 -). Vertebrate collection of Rumble records, published online, first released on 13/09/1998, last updated on 27/01/2010,
Citation types illustrated: Institutional dataset, one-time release, DOI; Institutional dataset, frequent update, LSID; Multiple authors, frequent update, ARK; Single author, frequent update, handle.
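The "citations within citations" idea in this example can be sketched as follows: a short user driven citation carries only a persistent identifier, and the composite citation expands it into the search details plus a numbered list of publisher determined citations. The `UserSearch` class, its field names, and all sample values below are hypothetical; the transcript does not show who the citing user is or what the identifiers look like.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserSearch:
    """Hypothetical record of one search against a data portal."""
    user: str            # citing user (assumed; not visible in the transcript)
    year: str
    search_string: str
    record_count: int
    identifier: str      # persistent identifier assigned to the user citation
    accessed: str        # access date and time, e.g. "04/11/2010, 10:03:30"


def short_citation(s: UserSearch) -> str:
    """Short form that a paper would cite; resolves to the composite form."""
    return f"{s.user} ({s.year}). user {s.identifier}"


def composite_citation(s: UserSearch, dataset_citations: List[str]) -> str:
    """Full text form: search header followed by each contributing citation."""
    header = (f"{s.user} ({s.year}). Search string: {s.search_string}, "
              f"{s.record_count} records, contributed by "
              f"{len(dataset_citations)} data resources, "
              f"user {s.identifier}, accessed on {s.accessed}.")
    numbered = [f"{i}. {c}" for i, c in enumerate(dataset_citations, start=1)]
    return "\n".join([header] + numbered)


# Hypothetical sample values.
search = UserSearch("Example User", "2010", "Panthera tigris", 696,
                    "doi:10.0000/example-search", "04/11/2010, 10:03:30")
print(short_citation(search))
print(composite_citation(search, ["Dataset citation A", "Dataset citation B"]))
```

Deriving the "contributed by N data resources" count from the list keeps the header consistent with the citations it wraps.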

Proposed Data Citation mechanism
- Publisher determined citations
  - Detailed citation as part of metadata document, and/or
  - Register citation at ‘Citation Service’
  - Persistent Identifier is assigned to metadata document and/or citations alone
- User driven citations
  - Search data through Publisher access point
    - Single dataset: search result together with Publisher determined citation
    - Multiple datasets: search result together with all datasets
  - User writes ‘user driven’ string of citation
  - User registers citation with ‘Citation Service’
  - User archives snapshot of search linked to ‘user driven citation’

Process for Publisher determined citations
Via metadata catalogue:
  1. Publishing-ready dataset
  2. Publisher authors ‘Metadata’ using metadata catalogue
  3. Metadata catalogue assigns Persistent Identifier to the metadata document
Via Citation Registry Service:
  1. Publishing-ready dataset
  2. Publisher writes detailed citation using ‘Citation Registry Service’
  3. Citation Registry Service assigns the Persistent Identifier to the citation

Process for User driven citations
  1. User accesses data
  2. Finalise dataset for use
  3. Use ‘Citation Service’
  4. Citation Service returns PI
  5. Archive dataset used
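The register, resolve, and archive steps in this process can be sketched with a toy in-memory service. This is not a description of any GBIF system: the `CitationService` class, its method names, and the `example-pi` identifier scheme are all hypothetical, and a real service would mint DOIs, ARKs, or similar identifiers and store data durably.

```python
import itertools
from typing import Dict, Iterable, Optional, Tuple


class CitationService:
    """Toy in-memory stand-in for the proposed 'Citation Service'."""

    def __init__(self, prefix: str = "example-pi") -> None:
        self._prefix = prefix                      # hypothetical PI scheme
        self._counter = itertools.count(1)
        self._citations: Dict[str, str] = {}
        self._snapshots: Dict[str, Tuple[str, ...]] = {}

    def register(self, citation_text: str) -> str:
        """Registry step: store the citation and return a new PI."""
        pi = f"{self._prefix}:{next(self._counter)}"
        self._citations[pi] = citation_text
        return pi

    def resolve(self, pi: str) -> str:
        """Resolution step: return the full text citation for a PI."""
        return self._citations[pi]

    def archive_snapshot(self, pi: str, records: Iterable[str]) -> None:
        """Archive the dataset actually used, linked to the citation's PI."""
        if pi not in self._citations:
            raise KeyError(f"unknown persistent identifier: {pi}")
        self._snapshots[pi] = tuple(records)

    def snapshot(self, pi: str) -> Optional[Tuple[str, ...]]:
        """Return the archived records for a PI, or None if none were archived."""
        return self._snapshots.get(pi)


# The user finalises a dataset, registers a citation, and archives a snapshot.
service = CitationService()
pi = service.register("Example User (2010). Search string: Panthera tigris")
service.archive_snapshot(pi, ["record-1", "record-2"])
print(pi, "->", service.resolve(pi))
```

Storing the snapshot under the same PI as the citation is one way to address the "How can I retrieve the same result?" question raised earlier in the talk.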

Issues and Challenges
- Complexity of data management
- Complexity of data networks
- Best practice guide for data citation
- Establishing ‘Data Citation Service’
- Socio-cultural challenges
- Archival of dataset cited by user
- Sustenance of ‘Data Citation Service’

Data Paper: as an alternate solution

Data Paper: Incentivising Data Discovery

Reward data publishing Data Paper Metadata document

Data Publishing = Scholarly Publishing!