Development of the Generation Challenge Program Ontology for Crops. Elizabeth Arnaud (Bioversity International), Rosemary Shrestha (CRIL-CIMMYT) and Richard Bruskiewich (IRRI). TDWG 2008 Annual Conference, October 2008, Fremantle, Western Australia.
The Generation Challenge Programme: Science for better crops in the tropics. For the majority of crop farmers in the developing world, the ravages of drought, low soil fertility, crop pests and diseases are aggravated by their limited access to improved crops.
The Generation Challenge Programme: Science for better crops in the tropics. By using advances in molecular biology and harnessing the rich global stocks of crop genetic resources, the Generation CP creates and provides a new generation of plants that meet farmer needs. Consultative Group on International Agricultural Research (CGIAR).
GCP subprograms: SP1 - Genetic Diversity of Global Genetic Resources; SP2 - Genomics towards gene discovery; SP3 - Trait Capture for Crop Improvement; SP4 - Bioinformatics and Crop Information Systems (building an 'integrated platform' of molecular biology and bioinformatics tools = Molecular breeding platform); SP5 - Capacity Building and Enabling Delivery.
The Generation Challenge Programme. Target areas: drought-prone environments. Mandate crops: all the CGIAR mandate crops (22 crops). Commissioned and competitive projects: 275 projects in 5 years.
GCP New Challenge initiatives. Cereals: 1. Rice/drought/Africa; 2. Wheat/drought/Asia; 3. Sorghum/drought/Africa; 4. Rice-Sorghum-Maize/soil problem/Asia & Africa. Legumes: 5. Cowpeas/drought/Africa; 6. Chickpeas/drought/Africa and Asia. Roots and tubers: 7. Cassava/virus/Africa.
Integration across diverse crop datasets. The volume and complexity of biological data are increasing. Historical data are scattered across numerous crop-specific databases. Each database uses slightly different terminology for phenotype-related terms.
Integration across Diverse GCP Crop Data. [Diagram: the data categories Germplasm, Genotype, Phenotype, Molecular Expression and Environment, connected by the relations "has", "determines" and "affects" and labelled with subprograms SP1, SP2 and SP3. Items shown: Anatomical, Developmental, Field Performance, Stress Response; Inventory, Identification (passport), Genealogy; Genetic Maps, Physical Maps, DNA Sequence, Functional Annotation, Molecular Variation (Natural or Induced); Location (GIS), Climate, Day Length, Ecosystem, Agronomy, Stresses; Transcriptome, Proteome, Metabolome, Physiology.]
An integrated platform for molecular breeding: to support and encourage researchers to share and reuse information among agricultural databases; to form the basis for the generation of data templates, web services and software.
GCP Scientific Domain Model: germplasm identification ("passport") and pedigree data; phenotypic characterization and evaluation data; geographic location and environmental descriptions; genotype and molecular data; genomic map data for markers and loci; functional genomics data.
"The exchange of new findings and joint work on projects presuppose that all those involved have the same understanding of the terms they use. This calls the need for an extensively standardized description of plant development stages with phenological characteristics and coding." Prof. Dr. F. Klingauf, President of the Federal Biological Research Centre for Agriculture and Forestry, Berlin and Braunschweig.
Importance of crop ontology. Similar plant structures are described by species-specific terms. Fruit: kernel in maize; grain in wheat; pod in beans; grain or caryopsis in rice.
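The fruit example on this slide can be read as a lookup from a crop-specific term to a shared concept. The following is a minimal sketch under that reading, not the GCP implementation; the table name `LOCAL_TO_SHARED` and the function `shared_concept` are hypothetical, and only the crop/term pairs come from the slide.

```python
from typing import Dict, Optional, Tuple

# (crop, local term) -> shared concept. The pairs are the ones listed on
# the slide; the table itself is an illustrative assumption.
LOCAL_TO_SHARED: Dict[Tuple[str, str], str] = {
    ("maize", "kernel"): "fruit",
    ("wheat", "grain"): "fruit",
    ("beans", "pod"): "fruit",
    ("rice", "grain"): "fruit",
    ("rice", "caryopsis"): "fruit",
}


def shared_concept(crop: str, local_term: str) -> Optional[str]:
    """Return the shared concept for a crop-specific term, or None if unmapped."""
    key = (crop.strip().lower(), local_term.strip().lower())
    return LOCAL_TO_SHARED.get(key)


if __name__ == "__main__":
    print(shared_concept("Maize", "Kernel"))
    print(shared_concept("rice", "caryopsis"))
```

Keying on the pair (crop, term) rather than on the term alone keeps "grain" in wheat and "grain" in rice as separate entries, so each crop database can later map to a different concept if the experts decide so.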
The GCP Ontology. A "thesaurus" of biological concepts, shared and used across species, to which genetic and phenotypic data can be associated. Supports integrative data mining on GCP annotated data using the platform and web services. Developed with crop experts for plant structure, developmental stages, traits and expression of the traits. For selected priority GCP crops: wheat, maize, sorghum, chickpea, banana and plantain.
GCP sources for mapping the terms. International Crop Information Systems: ICIS model ( ), IMIS (maize), IRIS (rice), IWIS (wheat). Musa germplasm information system ( ). ICRISAT information system (sorghum, chickpea). CIP information system (potato). Crop descriptors for traits (Bioversity International). GCP data templates. GCP datasets.
GCP Ontology
Developing the GCP ontology. [Diagram: GCP crop ontology mapping. GCP concept IDs are linked through DBXref to Plant Structure ontology (PO) and Trait Ontology (TO) concept IDs. Data annotation with the GCP ontology covers GCP data templates and crop databases.]
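The mapping on this slide can be pictured as a cross-reference table that is inverted to annotate crop database records. The sketch below assumes that reading. The GCP and TO identifiers are placeholders invented for illustration, as are the record layout and the function names; only `IMIS_TRAITID:1008` appears in the presentation.

```python
from typing import Any, Dict, List, Optional

# GCP concept ID -> cross-references (dbxrefs). The GCP and TO identifiers
# are hypothetical placeholders, not real ontology IDs.
GCP_DBXREFS: Dict[str, List[str]] = {
    "GCP:EXAMPLE_PLANT_HEIGHT": ["IMIS_TRAITID:1008", "TO:PLACEHOLDER_HEIGHT"],
}


def build_xref_index(dbxrefs: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the table so an external ID resolves to its GCP concept ID."""
    index: Dict[str, str] = {}
    for gcp_id, xrefs in dbxrefs.items():
        for xref in xrefs:
            index[xref] = gcp_id
    return index


def annotate(record: Dict[str, Any], index: Dict[str, str],
             source_prefix: str = "IMIS_TRAITID") -> Dict[str, Any]:
    """Return a copy of a crop database record with a 'gcp_id' field added."""
    key = f"{source_prefix}:{record['trait_id']}"
    annotated = dict(record)
    annotated["gcp_id"] = index.get(key)  # None when no mapping exists
    return annotated


def external_ids(gcp_id: str, prefix: str) -> List[str]:
    """List the cross-references of a GCP concept that carry a given prefix."""
    return [x for x in GCP_DBXREFS.get(gcp_id, []) if x.startswith(prefix + ":")]


if __name__ == "__main__":
    index = build_xref_index(GCP_DBXREFS)
    row = annotate({"trait_id": 1008, "value": 185.0}, index)
    print(row["gcp_id"], external_ids(row["gcp_id"], "TO"))
```

Once a record carries a GCP concept ID, the same table gives the route onward to the external ontologies, which is the role the slide assigns to DBXref.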
GCP ontology term has: Term: plant height; ID: GCP_322*; Namespace: maize_trait; Definition: Measurement of plant height from soil surface to the highest point in plant; Synonyms: PHT, PTHT, Planth. Shoot height; Dbxrefs: PO:10202TO: , IMIS_TRAITID:1008; is_a: GCP_
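The fields listed for a term correspond closely to the tags of an OBO flat-file `[Term]` stanza. The sketch below shows one way such a term could be held in memory and serialized. It is an illustration, not the GCP tooling: the class and function names are hypothetical, the `EXACT` synonym scope is an assumption, and because the slide's ID, PO/TO cross-references and parent are truncated, the ID and parent used here are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OntologyTerm:
    term_id: str
    name: str
    namespace: str
    definition: str
    synonyms: List[str] = field(default_factory=list)
    xrefs: List[str] = field(default_factory=list)
    parents: List[str] = field(default_factory=list)


def to_obo_stanza(term: OntologyTerm) -> str:
    """Serialize a term as an OBO 1.2 style [Term] stanza."""
    lines = [
        "[Term]",
        f"id: {term.term_id}",
        f"name: {term.name}",
        f"namespace: {term.namespace}",
        f'def: "{term.definition}" []',
    ]
    # Synonym scope EXACT is an assumption; the slide does not give a scope.
    lines += [f'synonym: "{s}" EXACT []' for s in term.synonyms]
    lines += [f"xref: {x}" for x in term.xrefs]
    lines += [f"is_a: {p}" for p in term.parents]
    return "\n".join(lines)


# Placeholder ID and parent: the slide's values are truncated, so these
# are hypothetical stand-ins rather than real GCP identifiers.
plant_height = OntologyTerm(
    term_id="GCP:PLACEHOLDER_ID",
    name="plant height",
    namespace="maize_trait",
    definition="Measurement of plant height from soil surface "
               "to the highest point in plant.",
    synonyms=["PHT", "PTHT"],
    xrefs=["IMIS_TRAITID:1008"],
    parents=["GCP:PLACEHOLDER_PARENT"],
)

if __name__ == "__main__":
    print(to_obo_stanza(plant_height))
```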
Building the ontology with OBO-Edit. Terms are linked by relationships such as is-a, part-of, has-a, disjoint from, derived from, etc. The ontology is structured as a hierarchical directed acyclic graph (DAG). Terms can have more than one parent and zero, one or more children. Draft releases of the OBO-formatted ontology files for rice, wheat and maize traits are available at
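The DAG properties named on this slide (typed relationships, several parents per term, no cycles) can be sketched in a few lines. This is an illustrative structure rather than how OBO-Edit stores ontologies; the class `OntologyGraph` and the example term names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple


class OntologyGraph:
    """Terms linked by typed relationships, kept acyclic on insertion."""

    def __init__(self) -> None:
        # child term -> list of (relationship, parent term)
        self.parents: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def ancestors(self, term: str) -> Set[str]:
        """All terms reachable by following parent links upward."""
        seen: Set[str] = set()
        stack = [term]
        while stack:
            current = stack.pop()
            for _, parent in self.parents.get(current, []):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def add_relation(self, child: str, relation: str, parent: str) -> None:
        """Add child -[relation]-> parent, refusing links that close a cycle."""
        if child == parent or child in self.ancestors(parent):
            raise ValueError(f"{child} -> {parent} would create a cycle")
        self.parents[child].append((relation, parent))

    def children(self, term: str) -> List[str]:
        """Direct children of a term, in sorted order."""
        return sorted(
            child for child, links in self.parents.items()
            if any(parent == term for _, parent in links)
        )


# Hypothetical example terms; "plant height" has two parents.
graph = OntologyGraph()
graph.add_relation("plant height", "is_a", "height trait")
graph.add_relation("plant height", "is_a", "maize trait")
graph.add_relation("height trait", "is_a", "trait")
graph.add_relation("maize trait", "is_a", "trait")
graph.add_relation("leaf", "part_of", "shoot")

if __name__ == "__main__":
    print(sorted(graph.ancestors("plant height")))
    print(graph.children("trait"))
```

Checking for cycles at insertion time keeps the structure a DAG by construction, so traversals such as `ancestors` always terminate.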
Complex trait name: %HSATIVUM_TILLER1_FLAG_1. Description: the trait is scored for severity of the disease caused by Helminthosporium sativum (leaf spot) at the tiller 1 and flag 1 stages, as a percentage. Complex trait names created by breeders in the crop databases are to be decomposed into simple terms that are readable by both humans and computers and mapped against the ontology.
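One way to picture the decomposition is a lookup-driven tokenizer applied to the breeder's trait name. The sketch below is an assumption about how this could work, not the GCP procedure: the lookup tables, the category labels, and the reading of the leading `%` as a unit marker are all hypothetical, inferred from the description on the slide.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup tables built from the slide's description.
UNIT_PREFIXES: Dict[str, str] = {"%": "percentage"}
TOKEN_LOOKUP: Dict[str, Tuple[str, str]] = {
    "HSATIVUM": ("causal agent", "Helminthosporium sativum (leaf spot)"),
    "TILLER1": ("growth stage", "tiller 1"),
    "FLAG_1": ("growth stage", "flag 1"),
}


def decompose_trait_name(name: str) -> List[Tuple[str, str]]:
    """Split a complex trait name into (category, simple term) pairs."""
    parts: List[Tuple[str, str]] = []
    if name and name[0] in UNIT_PREFIXES:
        parts.append(("unit", UNIT_PREFIXES[name[0]]))
        name = name[1:]
    pieces = name.split("_")
    i = 0
    while i < len(pieces):
        # Try the longest underscore-joined candidate first, so that
        # "FLAG_1" is matched as one token rather than "FLAG" and "1".
        for j in range(len(pieces), i, -1):
            candidate = "_".join(pieces[i:j])
            if candidate in TOKEN_LOOKUP:
                parts.append(TOKEN_LOOKUP[candidate])
                i = j
                break
        else:
            # No known token starts here; keep the piece for manual review.
            parts.append(("unmapped", pieces[i]))
            i += 1
    return parts


if __name__ == "__main__":
    for category, term in decompose_trait_name("%HSATIVUM_TILLER1_FLAG_1"):
        print(f"{category}: {term}")
```

Pieces without a lookup entry are returned as "unmapped" rather than dropped, so a curator can see which parts of a name still need an ontology term.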
Ontology for Crops phenotypic qualities. [Diagram: Plant Ontology (plant structure, development stages); Qualities & Units Ontology (PATO; phenotype "values" have "units", and units implicitly indicate the attribute); Assessment Methods Ontology (e.g. ICIS; qualifier). Effects: Genotype Factor (G): markers/alleles/sequence ontology; External environmental data (E): Treatment, Location, Climatic variables/water, Growth conditions, Stress Management/agronomy; Temporal factor (T): Time Ontology; Experiment factor (ED): Experimental design.]
GCP Ontology: present and future prospects. [Diagram, with a legend distinguishing Present Status from Future Plan. Data sources: CGIAR GCP data templates, ICIS dataset. GCP Domain Module Ontology: Taxonomic Ontology; General Germplasm Ontology; Plant Anatomy & Development Ontology; Phenotype & Trait Ontology; Structural & Functional Genomic Ontology; Location & Environment Ontology; General Science Ontology. Web Interface (Chado/koios): Query; Linkage to external ontologies.]
Thank you! Crops' Harvest Celebration, San Isidoro Feria, Lucban, Philippines.