The Generation Challenge Programme (GCP) Platform for Crop Research Richard Bruskiewich and the rest of …


1 The Generation Challenge Programme (GCP) Platform for Crop Research Richard Bruskiewich and the rest of …

2 …The GCP SP4 team and Contributors
IRRI-CIMMYT Crop Research Informatics Laboratory: Graham McLaren, Thomas Metz, Martin Senger, Ramil Mauleon, Mylah Anacleto, Michael Jonathan Mendoza, Victor Jun Ulat, Arllet Portugal, Ryan Alamban, Lord Hendrix Barboza, Jeffrey Detras, Kevin Manansala, Jeffrey Morales, Barry Peralta, Rowena Valerio, Nelzo Ereful
CIP: Reinhard Simon, Edwin Rojas
ICRISAT: Jayashree Balaji
ICARDA: Akinnola Akintunde
NCGR: Andrew Farmer, Gary Schiltz
SCRI: Jennifer Lee, David Marshall
Cornell University: Terry Casstevens, Pankaj Jaiswal, Dave Matthews
ACGT: Ayton Meintjes, Jane Morris
CIRAD: Manuel Ruiz, Alexis Dereeper, Matthieu Conte, Brigitte Courtois
Bioversity: Mathieu Rouard, Tom Hazekamp, Milko Skofic, Raj Sood
NIAS: Masaru Takeya, Koji Doi, Kouji Satoh, Shoshi Kikuchi
EMBRAPA: Marcos Costa, Natalia Martins, Georgios Pappas
Guy Davenport, Trushar Shah, Kyle Braak, Sebastian Ritter, Yi Zhang, Sergio Gregorio, Joseph Hermocilla, Michael Echavez, Roque Almodiel, Samart Wanchana, Supat Thongjuea
Theo van Hintum (WUR), GCP Subprogramme 4 Leader
University of British Columbia: Mark Wilkinson
GSC Bioinformatics Graduate Program, BC Cancer Agency: Benjamin Good, James Wagner

3 Overview
Generation Challenge Programme crop informatics research and development
GCP platform architecture:
• Domain model & ontology
• Application development framework

4 Challenge Programme “I challenge the next generation to use new scientific tools and techniques to address the problems that plague the world’s poor” Dr. Norman Borlaug http://www.generationcp.org

5 What is it?
An international research programme established in 2003, projected to last 10 years, and hosted by the CGIAR with global partners from ARI and NARES
Research themes directed to crop improvement:
• Genomics and comparative biology across species
• Characterization of genetic diversity for allele mining
• Gene transfer technologies
Five research subprogrammes, one of which is crop information systems development.

6 Challenge Programme
Partners: Cornell University (USA); Wageningen University (Netherlands); John Innes Centre (UK); NIAS (Japan); Agropolis (France); CIP (Peru); CIAT (Colombia); CIMMYT (Mexico); Bioversity (Italy); WARDA (Côte d’Ivoire); IRRI (Philippines); ICRISAT (India); ICARDA (Syrian Arab Rep.); IITA (Nigeria); EMBRAPA (Brazil); BioTec (Thailand); ACGT (South Africa); ICAR (India); CAAS (China)

7 GCP Research: from Genotype to Phenotype
Subprogrammes: SP1: Allelic Mining; SP2: Functional Assignment; SP3: Trait Synthesis
Diagram column headings: Process; Genetic Resources; Product
Diagram labels: Genomic annotation, Forward and Reverse Genetics, Gene arrays/gels; Candidate genes; NILs, RILs, Mapping pop., Mutants; Beneficial alleles Linked to Traits; Genebank Germplasm; Genotyping & Phenotyping; Value-added varieties; Advanced breeding lines as vehicles; Marker-aided Selection/Transformation

8 Integration across Diverse Crop Data
Diagram entities: Genotype; Germplasm; Phenotype; Molecular Expression; Environment
Data categories: Anatomical; Developmental; Field Performance; Stress Response; Inventory; Identification (passport); Genealogy; Genetic Maps; Physical Maps; DNA Sequence; Functional Annotation; Molecular Variation (Natural or Induced); Location (GIS); Climate; Day Length; Ecosystem; Agronomy; Stresses; Transcriptome; Proteome; Metabolome; Physiology
Relationship labels: has; determines; affects

9 Crop Information Systems: the Next
Large, globally distributed consortium
Diverse research requiring a diversity of tools
Large data sets with diverse data types
Many legacy informatics systems and tools
Global data integration required…
Key Issue: Interoperability

10 Some Basic GCP Research Objectives
• Compile a list of germplasm meeting specific passport data criteria
• Compile a list of genetic markers of interest from genetic and QTL maps
• Retrieve genotypes of specified markers, for specified germplasm
• Align gene expression data against QTL positional evidence to identify candidate gene loci for specified traits
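The third objective (retrieve genotypes of specified markers, for specified germplasm) amounts to a filter over genotype observations. The Java sketch below illustrates the idea only: the class names, the germplasm identifiers (G1, G2, ...) and the marker names (M1, M2) are invented for the example and are not taken from any GCP data source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// One genotype observation: the allele a germplasm sample carries at a marker.
// (Hypothetical class, for illustration only.)
final class GenotypeCall {
    final String germplasmId;
    final String marker;
    final String allele;

    GenotypeCall(String germplasmId, String marker, String allele) {
        this.germplasmId = germplasmId;
        this.marker = marker;
        this.allele = allele;
    }
}

final class GenotypeQuery {
    // Keep only the calls whose germplasm AND marker are both in the requested sets.
    static List<GenotypeCall> retrieve(List<GenotypeCall> calls,
                                       Set<String> germplasmIds,
                                       Set<String> markers) {
        List<GenotypeCall> hits = new ArrayList<>();
        for (GenotypeCall call : calls) {
            if (germplasmIds.contains(call.germplasmId) && markers.contains(call.marker)) {
                hits.add(call);
            }
        }
        return hits;
    }

    // Usage example on made-up data: genotypes at marker M1 for germplasm G1 and G2.
    static List<GenotypeCall> demo() {
        List<GenotypeCall> calls = Arrays.asList(
                new GenotypeCall("G1", "M1", "A"),
                new GenotypeCall("G1", "M2", "T"),
                new GenotypeCall("G2", "M1", "G"),
                new GenotypeCall("G3", "M1", "A"));
        Set<String> germplasm = new HashSet<>(Arrays.asList("G1", "G2"));
        Set<String> markers = new HashSet<>(Arrays.asList("M1"));
        return retrieve(calls, germplasm, markers);
    }
}
```

In a distributed setting the list of calls would come from one or more remote data sources rather than from memory, which is where the interoperability issue arises.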

11 A Generalized GCP Crop Research Integration Work Flow
Components: Comparative Map & Trait Viewer (NCGR/ISYS); Genetic Map Data Source(s); Generation Challenge Programme Domain Model & Middleware; Germplasm Passport/Phenotype/Genotype Querybuilder; Comparative (Functional) Genomics Tools; DIVA-GIS; Germplasm Data Source(s); Genomics Data Source(s); GIS Data Source(s)
Work flow steps:
• Get/analyse a genetic map
• Find germplasm genotyped with mapped markers
• Get genotype & phenotype of germplasm
• Get candidate genes in map interval
• Get functional information about genes
• Plot germplasm, genotype and phenotype on geographical maps
• Analyse source environment of germplasm
• Select “interesting” candidate genes; get alleles
• Select adapted germplasm with favorable phenotype & alleles for further evaluation

12 GCP Information Platform: User Perspective
An environment that provides improved access to data and analysis tools
Diagram labels: applications; integrated databases and tools

13 GCP Information Platform – Developers’ Perspective
Diagram labels: application layer; middleware; internet; Tapir, MOBY, etc.; Data Registry; local database layer

14 Generation CP Platform http://pantheon.generationcp.org

15 GCP Platform - General Architecture
• “Model Driven Architecture” based on “platform independent” GCP scientific domain models, parameterized with controlled vocabulary (“ontology”)
• GCP domain models mapped onto platform specific implementations.
• Reference (Java) GCP platform application programming interface (API)

16 Semantics of the GCP Model Driven Architecture
GCP is trying to model the meaning (“semantics”) of the crop research world. Semantics is found in the domain model at three distinct but interconnected levels:
• System architectural level: general scientific semantics in terms of high-level object concepts (“object types”) and their global inter-relationships.
• Entity level: attributes and behaviors internal to high-level object types.
• Attribute level: attribute values of objects that range over data types: simple (e.g. identifiers, numbers), complex (other classes of entities) or ontology (such as Gene Ontology (GO) terms, for a gene product).
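The three levels can be illustrated with a small Java sketch. Every name below (OntologyTerm, Phenotype, Germplasm, the has() method, the placeholder accession) is hypothetical and chosen for the example; this is not the GCP domain model itself.

```java
import java.util.ArrayList;
import java.util.List;

// Attribute level: a value drawn from a controlled vocabulary (an "ontology" data type).
final class OntologyTerm {
    final String ontology;   // source vocabulary, e.g. "PO" for Plant Ontology
    final String accession;  // term identifier; only placeholders are used in this sketch

    OntologyTerm(String ontology, String accession) {
        this.ontology = ontology;
        this.accession = accession;
    }
}

// Entity level: the attributes internal to one high-level object type.
final class Phenotype {
    final OntologyTerm observable; // ontology-typed attribute
    final double value;            // simple attribute (a number)

    Phenotype(OntologyTerm observable, double value) {
        this.observable = observable;
        this.value = value;
    }
}

// System architectural level: a relationship between high-level object types
// ("Germplasm has a Phenotype").
final class Germplasm {
    final String id;                                       // simple attribute (identifier)
    final List<Phenotype> phenotypes = new ArrayList<>();  // complex attribute (other entities)

    Germplasm(String id) {
        this.id = id;
    }

    Germplasm has(Phenotype phenotype) {
        phenotypes.add(phenotype);
        return this;
    }
}
```

A call such as `new Germplasm("G1").has(new Phenotype(new OntologyTerm("PO", "PO:EXAMPLE"), 12.5))` touches all three levels at once: the relationship, the entity's attributes, and an ontology-valued attribute.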

17 Layers of Semantics
1 Object Model of the Scientific Domain… 2 3 …Parameterized with Ontology
Diagram labels: Germplasm; Phenotype; Attribute; Value; Observable; Plant Ontology
Relationship labels: has an; with a; has a; ranges over

18 GCP Domain Model Specification
• High-level object types are specified with Unified Modeling Language (UML) and associated text narratives.
• Major object classes are represented in the object model.
• More specialized object types are specified by subclassing major object types using ontology.
• Reference model is coded with the Eclipse Modeling Framework (EMF), managed with source code versioning and automatically compiled into other representations.
http://pantheon.generationcp.org/demeter

19 Scope of GCP Domain Model & Ontology
Core models: generic concepts – identification, entities, features, organization, data management
• Models heavily parameterized by ontology (e.g. entity and feature “type” attributes)
Scientific models: extends core model into specific scientific scopes relevant to GCP:
• Germplasm data (including genetic resources passport)
• Genomics including genotypes, maps, sequences and functional annotation.
• Phenotype data
• Environmental data (including geographical location)

20 GCP Ontology
• Every attribute in the GCP domain model with data type SimpleOntologyTerm or subclass thereof, is an integration point for an external ontology.
• External public ontology (e.g. GO, PO, SO) reused when available, and new ontology developed within GCP to fill gaps.
• Ontology consolidated into GCP database based on GMOD Chado CV tables, indexed within platform using a GCP formatted identifier (that retains the source’s identifier).
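The idea of a platform identifier that retains the source's identifier can be sketched as a reversible wrapper. The slide does not give the actual GCP identifier format, so the "gcp:" prefix and the class and method names below are invented purely for illustration.

```java
// Hypothetical wrapper: NOT the real GCP identifier format, which the slide does not specify.
final class GcpTermId {
    private static final String PREFIX = "gcp:"; // invented prefix for this sketch

    // Build a platform identifier that embeds the source ontology's own identifier unchanged.
    static String wrap(String sourceId) {
        if (sourceId == null || sourceId.isEmpty()) {
            throw new IllegalArgumentException("source identifier must not be empty");
        }
        return PREFIX + sourceId;
    }

    // Recover the source identifier exactly as the external ontology published it.
    static String sourceId(String gcpId) {
        if (gcpId == null || !gcpId.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a wrapped identifier: " + gcpId);
        }
        return gcpId.substring(PREFIX.length());
    }

    // Derive the source vocabulary from an accession of the form "XX:number" (e.g. "GO:...").
    static String sourceOntology(String gcpId) {
        String source = sourceId(gcpId);
        int colon = source.indexOf(':');
        return colon < 0 ? "" : source.substring(0, colon);
    }
}
```

Because the source identifier is embedded unchanged, a term indexed inside the platform can still be looked up in the external ontology it came from.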

21 GCP Domain Model Mappings onto Platform Specific Implementations
GCP Domain Model (UML/EMF) mapped onto:
• GCP Platform Java Middleware & Applications
• OWL/RDF Ontology: VPIN/SSWAP.info
• SOAP Web Services (BioMOBY, SoapLab, GDPC)
• XML Schemata: GCP Data Templates, BioCASE/Tapir
• GCP Ontology Database
http://pantheon.generationcp.org/demeter

22 Reference GCP Platform API
PantheonBase: a relatively simple core Java Application Programming Interface (API) for software integration:
• DataSource: query data resources, using simple, ontology-driven SearchFilter specifications
• DataTransformer: computational input/output
• DataConsumer: communicate data to viewers
http://pantheon.generationcp.org
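The division of labour among these three roles can be sketched in Java. The interface names DataSource, DataTransformer, DataConsumer and SearchFilter come from the slide, and find() is mentioned on a later slide, but every signature, field and helper class below is an assumption made for illustration and is not the PantheonBase API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A query: the kind of entity wanted plus attribute criteria (hypothetical shape).
final class SearchFilter {
    final String entityType;
    final Map<String, String> criteria = new HashMap<>();

    SearchFilter(String entityType) {
        this.entityType = entityType;
    }

    SearchFilter where(String attribute, String value) {
        criteria.put(attribute, value);
        return this;
    }
}

// Assumed signatures; each row is a map of attribute name to value.
interface DataSource { List<Map<String, String>> find(SearchFilter filter); }
interface DataTransformer { List<Map<String, String>> transform(List<Map<String, String>> input); }
interface DataConsumer { void consume(List<Map<String, String>> data); }

// Toy data source over in-memory rows, standing in for a wrapped database or web service.
final class InMemoryDataSource implements DataSource {
    private final String entityType;
    private final List<Map<String, String>> rows;

    InMemoryDataSource(String entityType, List<Map<String, String>> rows) {
        this.entityType = entityType;
        this.rows = rows;
    }

    @Override
    public List<Map<String, String>> find(SearchFilter filter) {
        List<Map<String, String>> hits = new ArrayList<>();
        if (!entityType.equals(filter.entityType)) {
            return hits; // this source does not serve the requested entity type
        }
        for (Map<String, String> row : rows) {
            boolean matches = true;
            for (Map.Entry<String, String> criterion : filter.criteria.entrySet()) {
                if (!criterion.getValue().equals(row.get(criterion.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                hits.add(row);
            }
        }
        return hits;
    }
}

final class Pipeline {
    // Query a source, pass the result through a transformer, hand it to a consumer.
    static void run(DataSource source, SearchFilter filter,
                    DataTransformer transformer, DataConsumer consumer) {
        consumer.consume(transformer.transform(source.find(filter)));
    }
}
```

The point of the separation is that an application written against the three interfaces does not need to know whether the data came from a relational database, a web service, or a local file.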

23 GCP DataSource Interface

24 DataSource Interface

25 GCP Data Source Implementations
Direct integration of relational databases (Spring HttpInvoker, Hibernate, JPA):
• Developed for ICIS, GMOD Chado (beta)
Protocols:
• Generalized Java Client to connect to BioMoby web services; Java support for GCP-compliant BioMoby web service provider development (beta)
• Support for BioCase/Tapir data source integration (prototyped)
• GCP-compliant GDPC data source (prototyped)
• SSWAP/VPIN wrapper (under discussion)
Some other direct custom data source wrappers

26 Some GCP BioMOBY docs…
http://cropwiki.irri.org/gcp/index.php/MOBY_Rice_Network
http://pantheon.generationcp.org/moby
http://moby.generationcp.org

27 GCP BioMoby Support – a Synopsis
1. MoSES + Dashboard developed (M. Senger).
2. GCP model specific BioMoby datatypes specified.
3. Java libraries partly developed for interconversion of GCP BioMoby data types to/from GCP domain model Java objects (Barboza).
4. GCP DataSource Java implementation developed for client side of BioMoby that maps GCP DataSource find() use cases onto BioMoby web services using XML configuration files (no coding).
5. Java design pattern for modular implementation of BioMoby web services that get their data from any GCP-compliant DataSource that supports a given find() use case.
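Item 4 describes mapping find() use cases onto web services through configuration rather than code. The Java sketch below shows one way such a lookup could work with the standard JAXP parser. The XML element and attribute names, the use case name and the service name are all invented for the example; the real GCP configuration schema is not shown on the slide.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical registry: reads <mapping useCase="..." service="..."/> elements
// and resolves a find() use case to the name of the web service that serves it.
final class UseCaseRegistry {
    private final Map<String, String> serviceByUseCase = new HashMap<>();

    static UseCaseRegistry fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            UseCaseRegistry registry = new UseCaseRegistry();
            NodeList mappings = doc.getElementsByTagName("mapping");
            for (int i = 0; i < mappings.getLength(); i++) {
                Element mapping = (Element) mappings.item(i);
                registry.serviceByUseCase.put(
                        mapping.getAttribute("useCase"),
                        mapping.getAttribute("service"));
            }
            return registry;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers need no checked-exception handling.
            throw new IllegalStateException("could not read use case configuration", e);
        }
    }

    String serviceFor(String useCase) {
        String service = serviceByUseCase.get(useCase);
        if (service == null) {
            throw new IllegalArgumentException("no service configured for use case: " + useCase);
        }
        return service;
    }
}
```

With a configuration such as `<mappings><mapping useCase="findGermplasmByPassport" service="exampleGermplasmService"/></mappings>`, supporting a new use case means adding one line of XML instead of writing client code.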

28 GCP BioMoby “Sandwich”

29 (Partial) Inventory of 3rd Party Data Resources targeted for wrapping as GCP Data Sources
Data Type: Description
Microarray Data: MAXD database with microarray datasets from diverse GCP commissioned or competitive projects.
Genetic and QTL Mapping Data: QTL data available in ICIS, TropGenes. Genomic Diversity and Phenotype Connector (GDPC) connecting to Gramene, Panzea, GrainGenes et al.
Genomic Sequence Data and Annotation: NIAS KOME full length cDNA and RAP genome databases (?), connected to GCP web services by NIAS. OryzaSNP and GCP comparative genomic databases. Public sequence databases (via BioJava?)
Functional Genomics: OryGenesDb mutant data (CIRAD); IR64 rice mutant database (IRRI); Tos17 database (NIAS).
Germplasm Sample Characterization Data: Germplasm, passport, genotype and associated field data available in ICIS databases; TropGenes, MGIS, ICRIS.

30 GCP Platform Implementations
Standalone workbench (“GenoMedium”)
• Eclipse Rich Client Platform (RCP)
Web-based workbench (“Koios”)
• AJAX, PHP, Java (server side), Java Web Start
NCGR Integrated SYStem (ISYS)
Direct tool integration (e.g. GCP MaxdLoad)

31 http://moby.generationcp.org

32 GCP Web-Based Search Engine
http://koios.generationcp.org
• GCP semantics defined query
• Summary of query hits
• List of items matched
• View details at 3rd party web site or in locally invoked 3rd party data viewer

33 (Partial) Inventory of 3rd Party Analysis/Viewer Software being targeted for GCP Integration
Tool: Purpose
SoapLab2: Remote computational services access
Taverna: Bioinformatics work flow management
Apollo: Genome sequence browser
Cytoscape: Visualization of networks
ATV: Phylogenetic tree visualization
JalView: Comparative sequence alignments
TMEV: Microarray data analysis
EASE, Mapman: Gene functional annotation
CMTV: Comparative mapping and QTL
MAXDLoad & MAXDView: Microarray data management
GDPC tools (Browser, Tassel): Genomic diversity analysis

34 GCP “Pantheon” Project in CropForge http://cropforge.org/projects/pantheon/

35 Closing Perspective The GCP is a global consortium of 22++ crop research partners who need to share diverse large data sets and tools, in a globally distributed manner. Given the scope and duration of the GCP, developers within the consortium embraced the task of developing public global informatics standards for interoperability and integration. The effort is an open source, global community building exercise. We welcome the participation of any and all interested scientists and developers who might wish to use and/or contribute to the further evolution and application of these standards.

