Data Discovery Paradigms

Data Discovery Paradigms RDA Interest Group

Goals of the DDPIG
Founding Co-Chairs: Siri Jodha Khalsa, Univ. of Colorado; Anita de Waard, Elsevier
Goal: Identify, study and make recommendations concerning issues related to improving data discovery
Stakeholders: Data producers, data repositories, data seekers

Activities
23 topics identified in Kickoff meeting at RDA#8
74 people signed up for the group
Later, these topics were refined and voted on, leading to 5 top picks:
1. Best Practices for making data findable
2. Use cases, prototyping tools and test collections
3. Metadata enrichment
4. Cataloging common APIs
5. Relevancy ranking
Task forces were formed and leads identified
Task forces 1, 2 and 5 got to work immediately
Leads of 3 and 4 have been slower to start
Two very productive TF leads were asked to become co-chairs:
Mingfang Wu, Australian National Data Service
Fotis Psomopoulos, Aristotle University of Thessaloniki

IG Session at RDA P9
Attendance ~40
First three Task Forces presented their progress and proposed next steps
Metadata Enrichment Task Force was formed with new leads
Agreed follow-up actions leading to P10:
Relevancy ranking: send out questionnaire; collect and prioritise collaborative projects; decide on platform for testbed
Use cases: rank use cases, rewrite document, provide examples of platforms, write final report
Best practices: further edits on document, combine into a white paper, submit for publication
Metadata enrichment: start regular telecons to plan next steps

Best Practices for making data findable
Co-Leads: Anita de Waard, Jeffrey Grethe, William Michener, Mingfang Wu
Members: 26
Scope: Explore current practices of making data findable and recommend best practices to the data community
Activity to Date:
Drafted 3 documents:
Best practices for Data Producers
Best practices for Data Repositories
Best practices for Data Seekers
Plan to submit to journal for publication

Use cases, prototyping tools and test collections
Leads: Anita de Waard, Antica Culina, Fotis Psomopoulos, Jens Klump, Mingfang Wu
Members: 15
Scope: Identify key requirements evident across data discovery use cases from various scientific domains
Activity to Date:
Collected >60 use cases in the form of:
“As a” (i.e. role)
“Theme” (i.e. scientific domain/discipline)
“I want” (i.e. requirement, missing feature, supported function)
“So that” (i.e. what can be accomplished when the user need has been addressed)
“Comments”
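The use-case form above maps naturally onto a small record type. The following is a minimal sketch in Python, not the task force's actual tooling: the class name `UseCase`, the helper `group_by_theme`, and the example entry are all hypothetical and only illustrate how the five columns could be captured and grouped by domain.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class UseCase:
    """One row of the use-case form (field names are illustrative)."""
    as_a: str           # "As a": the role of the data seeker
    theme: str          # "Theme": scientific domain or discipline
    i_want: str         # "I want": requirement, missing feature, supported function
    so_that: str        # "So that": what becomes possible once the need is met
    comments: str = ""  # "Comments": free text, optional

    def summary(self) -> str:
        """Render the row as a single user-story sentence."""
        return (f"As a {self.as_a} ({self.theme}), I want {self.i_want}, "
                f"so that {self.so_that}.")


def group_by_theme(use_cases: Iterable[UseCase]) -> Dict[str, List[UseCase]]:
    """Group use cases by theme so shared requirements are easier to spot."""
    grouped: Dict[str, List[UseCase]] = {}
    for use_case in use_cases:
        grouped.setdefault(use_case.theme, []).append(use_case)
    return grouped


# Hypothetical entry, invented purely for illustration.
example = UseCase(
    as_a="researcher",
    theme="hydrology",
    i_want="to filter datasets by spatial extent",
    so_that="I can find records covering my study area",
)
```

Calling `example.summary()` yields one readable sentence per row, and `group_by_theme` gives a per-domain view that could support ranking the collected use cases.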

Relevancy ranking
Leads: Peter Cotroneo, Mingfang Wu, Siri Jodha Khalsa
Members: 11
Scope:
Help with selection of appropriate technologies for improving search functionality
Provide a means or forum for sharing experiences/tools/test collections related to relevancy ranking
Work with data search community to explore what are realistic and yet reliable ways for data repositories to carry out relevancy ranking comparison and evaluation tasks
Activity to Date:
Preparation of survey on relevancy ranking systems to be sent to large list of repositories

Metadata enrichment
Leads: Beth Huffer, Ilya Zaslavsky
Members: TBD
Activity to Date:
Two telecons since P9 to discuss scoping
Metadata enrichment: Bill Michener (dropped out) / Margaret Spyker (never responded?)

Outreach to other RDA Groups
Prior to P8, emails were sent to these RDA Groups inviting feedback on interim outputs from the first 3 task forces:
active-data-management-plans
data-versioning
domain-repositories
education-and-training-handling-research-data
health-data
libraries-research-data
metadata
national-data-services
pid
preservation-e-infrastructure
rdacodata-materials-data-infrastructure-interoperability
rdawds-certification-digital-repositories
rdawdsx-publishing-data
repository-platforms-research-data
Research Data Alliance DDPIG Interim Outputs for review and comment, sent March 23:
We wish to share with you the draft outputs created by three of the Task Force teams of the RDA Data Discovery Paradigms Interest Group. We think one or more of these outputs are relevant to the work your IG is doing. Your thoughts and feedback on the three interim documents will be greatly appreciated:
• Relevancy Ranking Task Force. See, in particular, items 2 and 3 under "Progress"
• Use Cases, Prototyping Tools and Test Collections Task Force. See, in particular, the Google Spreadsheet of captured use cases under "Deliverables"
• Best Practices for Making Data Findable Task Force. See, in particular, the three draft best practices documents under "Deliverables"
We will also be discussing these outputs at the 9th RDA Plenary in Barcelona, which will take place Tuesday, April 5, 16:00 - 17:30 https://www.rd-alliance.org/ig-data-discovery-paradigms-rda-9th-plenary-meeting. We hope you can join us there!
1 reply - invitation to participate in geospatial IG

Potentially Fruitful Collaborations
Sharing of approaches for improving findability among domain repositories
Contributing data discovery use cases to a common database of use cases
Providing a testbed for experimentation with retrieval/ranking algorithms
Have offers, suggestions from:
US NDS
ANDS
Elsevier’s AWS EC2
Certainly NDS Labs is an option. Peter also suggested Elsevier can provide AWS EC2 instances for a relevancy test bed. The Elsevier team could probably clone the machines that they used during the recent bioCADDIE Challenge. I will check with our ANDS service director to see if ANDS could provide a corpus of all the metadata records published to Research Data Australia, and ask other repositories in the survey if they could do the same as well. This will be a good topic for the Plenary: there are testbeds, tech stacks and corpora available, and hopefully we will also have a list of relevancy activities from the survey. We can call on people to participate in activities such as tweaking ranking algorithms and parameters, building test collections (developing search topics + relevance assessment), and evaluation.