Tripal in the Legume Genomics Community
Tripal Workshop, January 11th, 2015
Ethy Cannon, Iowa State University

1. A case study of Tripal/Chado
2. A description of two Tripal modules our groups are developing


A case study with PeanutBase and LegumeInfo
- PeanutBase is a new resource funded by the Peanut Foundation. Most personnel are at Iowa State University.
- LegumeInfo is the new implementation of the Legume Information System and is funded by the USDA-ARS. Most personnel are at the National Center for Genome Resources.
- Both teams share some members.


How can we share development and curation efforts across both websites and both locations? Ruby on Rails?


Development objectives
- Enable sharing of tool development, curation, and data between our two similar data portals.
- Avoid redeveloping existing tools.
- Address the challenges of genomic/breeding data portals for small communities.
- Support efforts toward standard data collection, metadata standards, schemas, and structures, with sharable loaders and viewers.


An overview of our Tripal/Chado experience
- Created (mostly empty) websites very quickly.
- Difficult for even experienced developers to learn how to customize Drupal/Tripal, and more difficult still to write new modules.
- Chado’s flexibility makes it difficult to work with.
  - There are multiple ways to load the same data.
  - It is difficult to write custom loaders that are compatible with Tripal.
  - Controlled vocabularies for describing the data structures are essential but difficult to develop.
- Once we got over the high hill(s), we rather suddenly found that we had useful loaders and viewers that tapped into underlying Tripal functionality, and modules that were easy to share between the two websites.

Wish list
- Chado: standards for loading common types of data (gene models, QTL, et cetera).
- Tripal/Chado: improved loaders with error checking to help debug data errors.
- Tripal: improved error reporting for both content management and module development.
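As an illustration of the kind of loader error checking wished for here, the following is a minimal sketch of a pre-load check that reports every problem by line number instead of stopping at the first one. It is not Tripal or Chado code. The column names in `REQUIRED_COLUMNS` and the function `check_qtl_rows` are hypothetical and chosen only for the example.

```python
import csv
import io

# Hypothetical column names for a tab-delimited QTL file.
# These are assumptions for illustration, not a Tripal or Chado standard.
REQUIRED_COLUMNS = ("qtl_name", "trait", "linkage_group", "start_cm", "end_cm")


def check_qtl_rows(text):
    """Return (good_rows, errors) for tab-delimited QTL text.

    All problems are collected with their line numbers so a curator can
    fix the whole file in one pass.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [], ["header: missing column(s) " + ", ".join(missing)]

    good_rows, errors = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        problems = []
        for col in REQUIRED_COLUMNS:
            if not (row[col] or "").strip():
                problems.append(f"{col} is empty")
        try:
            start, end = float(row["start_cm"]), float(row["end_cm"])
            if start > end:
                problems.append("start_cm is greater than end_cm")
        except (TypeError, ValueError):
            problems.append("start_cm/end_cm must be numbers")
        if problems:
            errors.append(f"line {line_no}: " + "; ".join(problems))
        else:
            good_rows.append(row)
    return good_rows, errors


if __name__ == "__main__":
    sample = (
        "qtl_name\ttrait\tlinkage_group\tstart_cm\tend_cm\n"
        "q1\tseed weight\tA01\t10.5\t20.0\n"
        "q2\t\tA02\t30\t25\n"
    )
    rows, problems = check_qtl_rows(sample)
    for message in problems:
        print(message)
```

Only rows that pass every check would be handed on to an actual database loader. The rest are returned as readable messages.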


Lessons learned
- Deciding to use Chado does not mean your data will look like other data in Chado.
- It is worthwhile to take the time to do things right; it makes your data and tools more sharable and puts you in a better position to use other people’s data and tools.
- Don’t waste resources solving problems that have already been solved, even if you don’t completely agree with the solution.
- Most important: Tripal/Chado permits productive cross-site and cross-database development, effectively increasing the size of both the LegumeInfo and PeanutBase teams.


Tripal extension modules
- PhyloTree: in development at LegumeInfo (Iliana Toneva, Alex Rice)
- QTL: in development at PeanutBase, based on the QTL module at CoolSeasonLegume.org (Ethy Cannon, Stephen Ficklin, QC by Scott Kalberer)



PhyloTree
For viewing phylogenetic trees of gene families.
Gene families are helpful for:
- doing cross-species comparative analysis,
- making it possible for a poorly characterized species like peanut to take advantage of resources for a well-characterized species like soybean.
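The idea of reaching a well-characterized species through a shared gene family can be sketched as a simple lookup. This is an illustration only, not the PhyloTree module's code or data model. The family names, gene identifiers, and the functions `build_indexes` and `homologs_in` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical (family, species, gene) records; every identifier is made up.
MEMBERSHIP = [
    ("family_0001", "peanut", "peanut_gene_A"),
    ("family_0001", "soybean", "soybean_gene_X"),
    ("family_0001", "soybean", "soybean_gene_Y"),
    ("family_0002", "soybean", "soybean_gene_Z"),
]


def build_indexes(records):
    """Index records as gene -> family and family -> [(species, gene)]."""
    family_of = {}
    members = defaultdict(list)
    for family, species, gene in records:
        family_of[gene] = family
        members[family].append((species, gene))
    return family_of, members


def homologs_in(gene, species, family_of, members):
    """List genes from `species` that share a gene family with `gene`."""
    family = family_of.get(gene)
    if family is None:
        return []  # gene is not assigned to any family
    return [g for s, g in members[family] if s == species and g != gene]


if __name__ == "__main__":
    family_of, members = build_indexes(MEMBERSHIP)
    print(homologs_in("peanut_gene_A", "soybean", family_of, members))
```

Starting from a peanut gene, the lookup returns the soybean members of the same family, which is where existing soybean annotation could then be consulted.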

PhyloTree status
- Hosted at LegumeInfo.
- Gene and gene family searches at both LegumeInfo and PeanutBase, plus homology through gene families, link the two websites together.
- Will be made available to all Tripal installations; the process of meeting Tripal standards has started.



Collecting, loading and displaying QTL data
- QTL data and metadata are very complex.
- The Chado schema is general-purpose and highly flexible.
- There are no standards and few recommended practices for mapping QTL data onto Chado.

The challenge: the complexity of QTL data and metadata, and the lack of strong standards, mean the data is collected and displayed differently by each web resource.
There is a recommendation, Minimum Information about a QTL or Association Study (MIQAS), but it is animal-centric.
Required: create a standard data collection template for plants, based on the MIQAS recommendation and what others are doing now.
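To make the idea of a shared data collection template concrete, here is a minimal sketch that maps records from two resources with different column names onto one set of template fields, and reports what is missing or unmapped. The field names in `TEMPLATE_FIELDS` are hypothetical and are not taken from MIQAS or from any Tripal module. The source labels `site_a` and `site_b` and their column names are also invented for the example.

```python
# Hypothetical template fields; NOT taken from MIQAS or any Tripal module.
TEMPLATE_FIELDS = ("qtl_name", "trait", "map_name", "linkage_group", "peak_position")

# Hypothetical column names used by two imaginary resources.
ALIASES = {
    "site_a": {
        "QTL": "qtl_name",
        "Trait": "trait",
        "Map": "map_name",
        "LG": "linkage_group",
        "Peak": "peak_position",
    },
    "site_b": {
        "qtl_symbol": "qtl_name",
        "trait_name": "trait",
        "genetic_map": "map_name",
        "chromosome": "linkage_group",
    },
}


def to_template(record, source):
    """Map one source record onto the shared template.

    Returns (template_record, missing_fields, unmapped_columns).
    """
    alias = ALIASES[source]
    out = {field: None for field in TEMPLATE_FIELDS}
    unmapped = []
    for column, value in record.items():
        field = alias.get(column)
        if field is None:
            unmapped.append(column)  # the template has no slot for this column
        else:
            out[field] = value
    missing = [field for field in TEMPLATE_FIELDS if out[field] is None]
    return out, missing, unmapped


if __name__ == "__main__":
    record_b = {
        "qtl_symbol": "q1",
        "trait_name": "seed weight",
        "genetic_map": "map1",
        "chromosome": "A01",
        "lod": "4.2",
    }
    print(to_template(record_b, "site_b"))
```

The lists of missing fields and unmapped columns show where resources disagree about what to collect, which is the kind of gap a common template would have to settle.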


“There’s more than one way to do it.” – Perl of Wisdom
- Different QTL information is provided and collected by different communities.
- QTL data has changed over time.
- We tried to find a consensus or “canonical” method and decided to mimic the Genome Database for Rosaceae’s data structure, but still managed to create something different.

Tripal extension module: QTL module (prototype)
Ethy Cannon & Stephen Ficklin


QTL module
Status:
- We have a prototype that is active at both PeanutBase and LegumeInfo.
- Working with SoyBase as well as Tripal folks to define a standard data collection template.
- First kickoff meeting to plan the publicly available Tripal QTL module at PAG (Sook Jung, Stephen Ficklin, Lacey Sanderson, Ethy Cannon).
- Input welcome from anyone.

Who We Are
- PeanutBase: Steven Cannon, Sudhansu Dash, Scott Kalberer
- LegumeInfo: Andrew Farmer, Alan Cleary, Alex Rice, Jugpreet Singh, Iliana Toneva, Pooja Umale, Nathan Weeks
- Genome Database for Rosaceae: Dorrie Main, Sook Jung
- CoolSeasonFoodLegume: Dorrie Main, Stephen Ficklin
Funding: Peanut Foundation, USDA-ARS