Dorrie Main, Jing Yu, Sook Jung, Chun-Huai Cheng, Stephen Ficklin, Ping Zheng, Taein Lee, Richard Percy and Don Jones.

Slides:



Advertisements
Similar presentations
Data overload – Breeding decision-support software to the rescue! S. Jung, Taein Lee, Kate Evans, Cameron Peace, Gennaro Fazio, Sushan Ru, Amy F. Iezzoni,
Advertisements

OBJ. 1 TRAIT AND MARKET SEGMENT BREEDING TARGET ESTABLISHMENT Year 2 Deliverables, Challenges, Year 3 Goals.
Cameron Peace, Washington State University
Sook Jung, Taein Lee, Stephen Ficklin, Kate Evans, Cameron Peace and Dorrie Main.
How to use GDR, the Genome Database for Rosaceae Sook Jung, Stephen Ficklin, Taein Lee, Chun-Huai Cheng, Anna Blenda, Jing Yu, Sushan Ru, Kate Evans, Cameron.
GDR, the Genome Database for Rosaceae, in Chado and Tripal Sook Jung, Stephen Ficklin, Taein Lee, Chun-Huai Cheng, Anna Blenda, Sushan Ru, Ping Zheng,
GDR/CottonGen: Converting legacy sites to Tripal Sook Jung, Jing Yu, Taein Lee, Chun-Huai Cheng, Stephen Ficklin, Dorrie Main.
Integrating Phenotypic Data With Genomic, Genetic and Genotypic Data Using Chado Sook Jung, Taein Lee, Stephen Ficklin, Jing Yu, Dorrie Main.
Ten years of GDR Current Resources and Functionality S Jung, T Lee, S Ficklin, CH Cheng, P Zheng, A Blenda, S Ru, K Evans, C Peace, N Oraguzie, AG Abbott,
GDR What’s New and What’s Next Dorrie Main, Sook Jung, Stephen Ficklin, Taein Lee, Chun-Huai Cheng, Anna Blenda, Jing Yu, Ping Zheng, Sushan Ru, Julia.
New Data and Functionality of GDR, the Genome Database for Rosaceae Sook Jung, Taein Lee, Stephen Ficklin, Chun-Huai Cheng, Ping Zheng, Anna Blenda, Sushan.
GenSAS: Genome Sequence Annotation Server, a Tool for Online Annotation and Curation Dorrie Main, Taein Lee, Ping Zheng, Sook Jung, Stephen P. Ficklin,
Update in GDR, The Genome Database for Rosaceae S Jung, T Lee, S Ficklin, CH Cheng, I Cho, P Zheng, K Evans, C Peace, N Oraguzie, A Abbott, D Layne, M.
GENOME SCANS New QTLs discovered Breeding markers Screening services Marker priorities Crossing scheme Trial MASS Socio-economics information DNA information.
Introduction to NRSP databases and other breeding databases.
Building Database Resources For Translational Research in Rosaceae Sook Jung, Taein Lee, Stephen Ficklin, Chun-Huai Cheng, Anna Blenda, Sushan Ru, Ping.
Jing Yu 1, Sook Jung 1, Chun-Huai Cheng 1, Stephen Ficklin 1, Taein Lee 1, Ping Zheng 1, Don Jones 2, Richard Percy 3, Dorrie Main 1 1. Washington State.
Jing Yu 1, Sook Jung 1, Chun-Huai Cheng 1, Stephen Ficklin 1, Taein Lee 1, Ping Zheng 1, Don Jones 2, Richard Percy 3, Dorrie Main 1 1. Washington State.
Introducing NRSP10 Database Infrastructure for Specialty Crops Computer Applications in Horticulture/Teaching Methods Workshop ASHS Annual Conference 2015.
Lacey-Anne Sanderson A Toolkit for Construction of Genomic and Genetic Websites.
Jing Yu, Sook Jung, Chun-Huai Cheng, Stephen Ficklin, Ping Zheng, Taein Lee, Richard Percy, Don Jones, Dorrie Main.
Jing Yu 1, Sook Jung 1, Chun-Huai Cheng 1, Stephen Ficklin 1, Taein Lee 1, Ping Zheng 1, Don Jones 2, Richard Percy 3, Dorrie Main 1 1. Washington State.
Diversity Bioinformatics Terry Casstevens Institute for Genomic Diversity, Cornell University GMOD Meeting at NESCent Durham, NC – June 29-30, 2006.
GDR in Drupal facilitating community building and efficient maintenance.
Jing Yu 1, Sook Jung 1, Chun-Huai Cheng 1, Stephen Ficklin 1, Taein Lee 1, Ping Zheng 1, Don Jones 2, Richard Percy 3, Dorrie Main 1 1. Washington State.
Jodi Humann, Stephen Ficklin, Taein Lee, Chun-Huai Cheng, Sook Jung, Jill Wegrzyn, David Neale and Dorrie Main An easy to use, web-based solution for specialty.
Gramene Objectives Provide researchers working on grasses and plants in general with a bird’s eye view of the grass genomes and their organization. Work.
NRSP10 Database Resources for Crop Genomics, Genetics and Breeding Research NRSP Crops Breeders Database Needs Focus Group Meeting July 30, 2015 Pullman,
Updates to the Cool Season Food Legume Genome Database Dorrie Main, Chun-Huai Cheng, Rebecca McGee, Clarice Coyne, Stephen Ficklin, Taein Lee, Sook Jung,
Development of a Cotton Marker Database (CMD) for Gossypium genome and genetic research CMD Main Goals Collect and integrate.
Jing Yu, Sook Jung, Chun-Huai Cheng, Taein Lee, Katheryn Buble, Ping Zheng, Jodi L. Humann, Deah McGaughey, Heidi Hough, Stephen P. Ficklin, B. Todd Campbell,
Progress on TripalBIMS Breeding Information Management System in Tripal Sook Jung, Taein Lee, Chun-Huai Chen, Jing Yu, Ksenija Gasic, Todd Campbell, Kate.
Jing Yu 1, Sook Jung 1, Chun-Huai Cheng 1, Stephen Ficklin 1, Taein Lee 1, Ping Zheng 1, Don Jones 2, Richard Percy 3, Dorrie Main 1 1. Washington State.
GDR Workshop Tuesday 21st, 2016 RGC8 2016
5/12/2018 Genome Database for Rosaceae New Data and New Functionality
“Big Data”, tree fruit and the Genome Database for Rosaceae
Resources Available for Fragaria Research through the Genome Database for Rosaceae Dorrie Main, Sook Jung, Chun-Huai Cheng, Stephen Ficklin, Taein Lee,
Breeding Information Management System
9/11/2018 Genome Database for Rosaceae Since RGC7
CottonGen: An Up-to-Date Resource Enabling Genetics, Genomics and Breeding Research for Crop Improvement Plant and Animal Genome Conference XXV Jing Yu1,
The Cool Season Food Legume Database: An Integrated Resource for Basic, Translational and Applied Research Dorrie Main, Chun-Huai Cheng, Stephen Ficklin,
Overview for Breeders Jing Yu, Sook Jung, Chun-Huai Chung, Taein Lee, Ping Zheng, Jodi Humann, Deah McGaughey, Morgan Frank, Kirsten Scott, Heidi Hough,
A Breeders Perspective on using the Breeding Information Management System for Cotton Breeding Todd Campbell, Taein Lee, Sook Jung, Jing Yu, Don Jones.
Genome Database for Rosaceae
the Genome Database for Rosaceae: New Data and Functionality
CottonGen An Online Resource for the Cotton Community
Plant and Animal Genome Conference XXIV
Updates to the CSFL Genome Database:
for the Cotton Community
Updates and Future Direction
CottonGen BIMS for Effective and Efficient Management of Breeding Data
Membership Login/sign in
Using CottonGen for Crop Improvement
Genome Database for Rosaceae:
Membership Login/sign in
Membership Login/sign in
Jing Yu1, Taein Lee1, Sook Jung1, Don C. Jones2, B
Jing Yu, Taein Lee, Sook Jung, Don Jones, Todd Campbell,
Jing Yu, Taein Lee, Sook Jung, Don Jones, Todd Campbell,
BIMS (Breeding Information Management System)
How to Effectively Search and Download Data in CottonGen
CottonGen: Enabling Cotton Research through Big-Data Analysis and Integration Jing Yu, Sook Jung, Chun-Huai Cheng, Taein Lee, Katheryn Buble, Ping Zheng,
New Data and Functionality in NRSP10 Databases
Key CottonGen Features For more information contact:
Viewing Genetic Maps with MapViewer in CottonGen
2016 Beltwide Cotton Conference
Germplasm Overview Page Trait Descriptor Standard Images
Membership Login/sign in
Visualization of Conserved Syntenic Blocks Among Six Cotton Genomes in CottonGen Ping Zheng, Sook Jung, Chun-Huai Cheng, Jing Yu, Heidi Hough, Josh Udall,
For more information contact:
Presentation transcript:

Dorrie Main, Jing Yu, Sook Jung, Chun-Huai Cheng, Stephen Ficklin, Ping Zheng, Taein Lee, Richard Percy and Don Jones

 Introduction What is CottonGen What is CottonGen Data Integration Data Integration  Demo of CottonGen Database Overview Database Overview Current Data, Tools, and Searches for Breeders Current Data, Tools, and Searches for Breeders  Future Work Examples from the Genome Database for Rosaceae Examples from the Genome Database for Rosaceae Toward a complete information management system for breeding Toward a complete information management system for breeding

 A new cotton community database enabling basic, translational and applied cotton research.  Consolidates and expands CottonDB and CMD to include transcriptome, genome sequence and breeding data.  Built using the new open-source, user-friendly, Tripal database infrastructure used by several other databases.

Genetics Breeding Germplasm Diversity Genomics Integrated Data & Tools Basic Science Structure and evolution of genome, gene function, genetic variability, mechanism underlying traits Translational Science QTL /marker discovery, genetic mapping, Breeding values Applied Science Management of breeding data Selection performance comparisons Utilization of DNA information in breeding decisions Integrated Data Facilitates Discovery

Markers - 23,935 genetic markers Markers - 23,935 genetic markers Maps - 49 maps with over 34,559 loci Maps - 49 maps with over 34,559 loci QTLs – 988 QTLs for 25 traits QTLs – 988 QTLs for 25 traits Polymorphism - 2,264 polymorphic SSRs Polymorphism - 2,264 polymorphic SSRs Germplasm – 14,959 germplasm records (14 collections) Germplasm – 14,959 germplasm records (14 collections) Traits - 73,296 trait scores of 6,871 GRIN entries Traits - 73,296 trait scores of 6,871 GRIN entries Sequences - 610,246 sequence records Sequences - 610,246 sequence records References - Nearly 11,000 references References - Nearly 11,000 references CottonGen Gossypium Unigene v1.0 CottonGen Gossypium Unigene v1.0 G. raimondii (D5) genome – BGI & JGI versions G. raimondii (D5) genome – BGI & JGI versions

CMap – 49 maps CMap – 49 maps GBrowse – G. raimondii (BGI & JGI versions) GBrowse – G. raimondii (BGI & JGI versions) FPC – TM-1 contigs from USDA-ARS/TAMU FPC – TM-1 contigs from USDA-ARS/TAMU BLAST Servers - UniProt and nr Proteins, BGI D-genome sequences, dbEST, unigenes, and CottonGen markers (20 datasets) BLAST Servers - UniProt and nr Proteins, BGI D-genome sequences, dbEST, unigenes, and CottonGen markers (20 datasets) retrieve sequences in FASTA format Sequence Retrieval – retrieve sequences in FASTA format

Simple Example 1: Identify germplasm with 2.5% span length >= 1

Search criteria: 2.5% span length >= 1 1.From Navigation Bar, click “Search”, then select “Trait Evaluation” 2.From new window, select “Quantitative Traits” 3. Select “2.5% Span Length”- the minimum & maximum value will show up automatically 4. Change the minimum value =1, then “Submit” query

Select germplasm details

Example 2: Find all germplasm with Deltapine 90 in their pedigree

Default result is the table of all germplasm which has pedigree

Can we use wild card such as “DP*90”? There are many others not using “DP 90” but “DP90” or “DP Acala 90”, etc.

 Curation of Uzbekistan (800) and Chinese (3000) germplasm collection data (identity and descriptors).  Adding data (images, evaluation data) from the USDA-ARS Research Project: “Genotypic and Phenotypic Analysis and Digital Imaging of Accessions in the US National Cotton Germplasm Collection ”  Develop a comprehensive breeders toolbox to assist in breeding decisions

 Genome Database for Rosaceae, established in 2003  Breeding Toolbox initiated in 2008 with support from Industry for the WA Apple and Sweet Cherry Breeding Programs  Further developed with support from the USDA SCRI program projects RosBREED and tfGDR ( )

 Data Private data from WA apple breeding program Private and public breeding data from RosBreed project (apple, strawberry, peach, sweet cherry, tart cherry)  Interface Data Management (Browse, Search and Download) Data Conversion (Generate Input files for Pedimap) Decision Support  Cross Assist  Trait Locus Warehouse  Marker Converter Database and Tools for Breeding Data

 Search/download phenotype Data  Search/download genotype Data  Creating website for each breeding program or group Data Management

34 Phenotypic Data Search

35

Variety Detail Page

37 Genotypic Data Search

 Generate Input files for Pedimap, a tool for exploring and visualizing the flow of phenotypes and alleles through pedigrees Data Conversion

Choose Pedigree

Choose Trait and Modify Values

Generate a File and View in Pedimap

 Cross Assist  Trait Locus Warehouse  Marker Converter Decision Support

A web interface to generate a list of parents and the number of seedlings to get the progeny with desired traits Methods “Phenotype” (uses only phenotypic information of individuals in the dataset), “+Pedigree” (uses both phenotypic and pedigree information) “+Ped+DNA” (uses phenotypic, pedigree information and information provided by DNA- based functional genotypes). What is Cross Assist?

Step 1: Select Method

Step 2: Select target number and trait thresholds

Step 3: Filter results by data completeness, required number of seedlings, and parentage

Trait Locus Warehouse

Design new markers by exploring the genome around QTLs QTLs that are anchored to the reference genome can be viewed in GBrowse by clicking Genomic Location. Click neighboring or co-localized markers to view details of markers and view in GBrowse. In GBrowse, retrieve sequences around features (genes and markers) of interest using the sequence retrieval tool GDR-Primer3 tool can be used to design primers with the retrieved sequences. Marker Converter

Search For QTL in Marker Converter

OR Forward QTLs from Trait Locus Warehouse

 Go to GBrowse for genome-anchored QTLs Results

OR go to QTL/Marker pages

 View associated resequencing reads, genes, SNPs and other markers. GBrowse

OR OR follow directions in ‘RosBREED Resequencing Alignments’ tab to view the alignments on IGV (Integrative Genomics Viewer) ( View alignments between reads and the reference genome

Retrieve sequences from selected feature

Sequence Retrieval Tool

Design New Primers in GDR-Primer3

Future Rosaceae Breeders Toolbox Development Data RosBreed QTLs and their genome positions More breeding data and DNA based functional genotypes More re-sequencing data Functionality Data management: online data submission and editing Viewing data on screen and generating report pages Decision support tools Cross Assist: to accommodate more complex situations (selfing, cross compatibility, etc) To upload users’ own data Further develop more tools

 Development of a complete cotton breeding information database system FieldLab Local BIDS CottonGen

Main Lab team who work on CottonGen Taein Lee Dr. Stephen Ficklin Chun-Huai Cheng Dr. Ping Zheng Dr. Jing Yu Acknowledgements Dr. Sook Jung

 The CottonGen Steering Committee  Industry Funding Cotton Incorporated, Bayer CropScience, Dow/Phytogen, Monsanto, Association of Agricultural Experiment Station Directors  Government Funding USDA ARS USDA NIFA AFRI and SCRI programs (funding Mainlab Tripal, Rosaceae Breeders Toolbox and GenSAS Development)  University Support Washington State University, Texas A&M, Clemson University  Community of Cotton Researchers