RDA Wheat Data Interoperability Cookbook and last developments, 9th March 2015, San Diego.



The WDI working group in brief
● Endorsed by RDA in March 2014
● Members: about 30, of whom 15 are active (Wheat scientists, data and metadata technologists)
● Goal: contribute to improving the interoperability of Wheat related data by
  ○ Building a common interoperability framework (metadata, data formats and vocabularies)
  ○ Providing guidelines for describing, representing and linking Wheat related data

Initial plans
Deliverables
● A report on the survey of existing standards
● A cookbook for the Wheat data managers community, with guidelines on which data formats, metadata, vocabularies and ontologies to use to describe, represent and link different types of Wheat data
● A library of linked vocabularies and ontologies in machine-readable formats that follow the Linked Data standards
● A prototype that showcases the gain in interoperability

Where we are
● Surveys
  ○ Landscape of Wheat related standards and their use by the community
  ○ Comprehensive overview of Wheat related ontologies and vocabularies
● Workshops
  ○ Recommendations
  ○ Mappings between different data formats
  ○ Actions to conduct in order to improve the current level of Wheat related data interoperability
  ○ Interoperability use cases
● Implementation
  ○ Interactive cookbook: recommendations + guidelines
  ○ A repository of Wheat related linked vocabularies (BioPortal)

Wheat related standards survey and workshop

Data type | Data formats currently used (standardized; tool specific; non standardized) | Recommendations
----------|--------------------------------------------------------------------------------|----------------
SNPs | VCF; BAM/SAM, BED, VARSCAN, VEP | VCF files generated by using the survey sequences of IWGSC + metadata about the VCF files to enrich the information about the SNPs
Genome annotations | GenBank Flat File, General Feature Format (GFF), EMBL | GFF3 + specifications regarding the description of specific columns
Germplasm | MCPD, ABCD, Darwin Core, Darwin Core Germplasm, GRIN-Global; tabulated | MCPD
Gene expression | Many format standards laid out by repositories such as NCBI (GEO) and EBI ArrayExpress | Existing format standards laid out by repositories such as NCBI (GEO) and EBI ArrayExpress + ENA
Physical maps | GFF; CMap, FPC | GFF3
Genetic maps | CMap, gnpmap | GFF3 (to be confirmed)
Phenotypes | DROPS, PED, ISA-Tab, Ephesis; tabulated | ISA-Tab
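To make the SNP recommendation more concrete, here is a minimal Python sketch that separates the meta-information lines of a VCF file (the `##key=value` header lines, where descriptive metadata can be carried) from its variant records. It is only an illustration, not the working group's tooling: the function name `parse_vcf`, the example reference name and the example variant are all invented for this sketch.

```python
from typing import Dict, List, Tuple

# The eight fixed, tab-separated columns defined by the VCF specification.
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf(text: str) -> Tuple[Dict[str, List[str]], List[Dict[str, str]]]:
    """Split VCF text into meta-information entries and variant records."""
    meta: Dict[str, List[str]] = {}
    records: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("##"):
            # Meta-information line: ##key=value (a key may appear several times).
            key, _, value = line[2:].partition("=")
            meta.setdefault(key, []).append(value)
        elif line.startswith("#"):
            continue  # the single "#CHROM ..." column header line
        else:
            fields = line.split("\t")
            records.append(dict(zip(VCF_COLUMNS, fields[:8])))
    return meta, records


# Hypothetical content, invented for this sketch (not real wheat data).
EXAMPLE = "\n".join([
    "##fileformat=VCFv4.2",
    "##reference=hypothetical_survey_sequence_assembly",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1A\t1042\tsnp_0001\tA\tG\t50\tPASS\tDP=14",
])

meta, records = parse_vcf(EXAMPLE)
print(meta["reference"], records[0]["POS"], records[0]["ALT"])
```

A production pipeline would more likely rely on an established VCF library; the point here is only that the header is where file-level metadata about the SNPs can live.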

Examples of use cases

Title: Searching for germplasm with specific traits
Description: Example of searching for germplasm with specific traits - tagged with ontology terms?
Data types: Germplasm, Phenotype
Challenges:
● Metadata are very important (standardized format)
● Association of genes to traits, linked to germplasm and marker information
● Need for quality controls: how confident are you of the data source?
● Provenance of the germplasm: pedigree, ownership
● Standard system for tracking germplasm and names

Title: Identification of wheat genes that control root growth
Description: Requires annotated genes (Gene Ontology, PFam, and other functional annotation)
Data types: Genomic annotations? Gene location? (IWGS-SS ID or MIPS HCS link)
Challenges:
● Mapping between wheat genes and orthologs from other species (deduce function by sequence similarity)
● Access to RNASeq data (genes that are not expressed in roots may be irrelevant)
● Mapping of wheat genes and information on their function based on literature

Title: Query on trial data associated with varieties
Description: To search wheat varieties with distribution maps, production figures, performances in wheat mega environments, associated projects worldwide, plus layers of climatic data on specific wheat production areas and disease prevention information.
Data types: Phenotypic data, GIS data, (wheat economy/production data)
Challenges:
● Phenotypic data should be linked to GIS data
● Using keywords or ontology terms, a system or tool should be able to pull such information from the different websites and systems developed by the wheat community
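The first use case can be sketched in a few lines of Python, assuming germplasm records are already tagged with ontology term identifiers. Everything below is hypothetical: the `Accession` class, the `find_accessions` helper, the accession records and the `TRAIT:…` identifiers are invented for illustration and do not come from any real ontology or genebank.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Accession:
    """A germplasm accession annotated with ontology term identifiers."""
    accession_id: str
    name: str
    trait_terms: FrozenSet[str]  # ontology term IDs describing observed traits
    source: str                  # provenance: who holds or supplied the material


def find_accessions(accessions: Iterable[Accession],
                    required_terms: Iterable[str]) -> List[Accession]:
    """Return the accessions tagged with every term in required_terms."""
    wanted = frozenset(required_terms)
    return [acc for acc in accessions if wanted <= acc.trait_terms]


# Hypothetical records and term IDs, invented for this sketch.
CATALOGUE = [
    Accession("ACC-001", "Line A", frozenset({"TRAIT:0001", "TRAIT:0007"}), "Genebank X"),
    Accession("ACC-002", "Line B", frozenset({"TRAIT:0007"}), "Genebank Y"),
    Accession("ACC-003", "Line C",
              frozenset({"TRAIT:0001", "TRAIT:0007", "TRAIT:0009"}), "Genebank X"),
]

matches = find_accessions(CATALOGUE, ["TRAIT:0001", "TRAIT:0007"])
print([acc.accession_id for acc in matches])
```

Matching on shared term identifiers rather than free-text trait names is what makes the query work across sources, which is why the challenges above stress standardized metadata and provenance.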


Wheat related ontologies and vocabularies survey

The objectives of the survey
● Assess the level of visibility and interoperability of Wheat related vocabularies and ontologies
  ○ Is the vocabulary/ontology updated regularly?
  ○ What license and/or copyright is used?
  ○ Is the vocabulary/ontology part of any ontology communities or listing services?
  ○ Is the vocabulary/ontology used or implemented in any database/repository?
  ○ Does the vocabulary/ontology interlink and/or map to other vocabularies and ontologies?
  ○ Does the vocabulary/ontology …
● Identify the domain covered by the ontologies and vocabularies
● Refine the cookbook
● Collect more interoperability use cases
● Collect some technical details

The objectives of the survey
Guidelines and Repository
● What level of visibility/operability?
● What content?
● What formats and technologies?

The Wheat related BioPortal allows one to search for terms across multiple ontologies, browse mappings between terms in different ontologies, receive recommendations on which ontologies are most relevant for a corpus, and annotate text with terms from ontologies.
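As a rough illustration of the term search, the Python sketch below builds a search request URL. It assumes the Wheat related repository exposes the same REST search endpoint as the public NCBO BioPortal (`/search` with `q`, `ontologies` and `apikey` parameters); the base URL, the ontology acronyms and the API key are placeholders, and no request is actually sent.

```python
from typing import Iterable
from urllib.parse import urlencode

# Assumption: the public NCBO BioPortal REST endpoint. The Wheat related
# instance may live at a different address.
ASSUMED_BASE_URL = "https://data.bioontology.org"


def build_search_url(term: str,
                     ontologies: Iterable[str],
                     api_key: str,
                     base_url: str = ASSUMED_BASE_URL) -> str:
    """Build a term-search URL, optionally restricted to some ontologies."""
    params = {"q": term, "apikey": api_key}
    acronyms = ",".join(ontologies)
    if acronyms:
        # Comma-separated ontology acronyms narrow the search.
        params["ontologies"] = acronyms
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


# Placeholder acronyms and key, invented for this sketch.
url = build_search_url("grain yield", ["ONTO_A", "ONTO_B"], "YOUR_API_KEY")
print(url)
```

The resulting URL could then be fetched with any HTTP client; the response format and the other services (mappings, recommendations, annotation) are not covered by this sketch.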

Next steps
● Metadata (harmonization, minimal metadata sets)
● Mappings
● Next workshop (summer 2015)
● Review and complete the recommendations
● Refine and complete the guidelines and the best practices
● Finalize the repository of Wheat related vocabularies
● Implement the prototype

Thanks!