Part I: Tips and techniques from curators
Kate Dreher
TAIR, AraCyc, PMN
Carnegie Institution for Science



Sometimes, one gene isn't enough...
- Scientists often want to work with more than one gene or protein related by a common feature
- TAIR (and the PMN) offer some basic tools:
  - to create customized data sets (e.g. lists of genes or proteins)
  - to add more information to data sets
  - to analyze data sets

Creating customized data sets using TAIR and the PMN
Data sets can be based on many different criteria:
- Overall sequence alignment (DNA or protein)
- Sequence motifs (DNA or protein)
- Protein domains and biochemical properties
- Gene/Protein function
  - Subcellular location
  - Molecular function
  - Biological process
- Expression pattern
- Biochemical pathway
- Mapping region
- Phenotype
- Gene families
How do you generate these data sets?

Creating customized data sets: TAIR
Data sets can be obtained using several strategies at TAIR:
- Advanced search pages
- Data-mining tools

Creating customized data sets: PMN
Data sets can be obtained using several strategies at the PMN
What is the PMN?
- It is the home of AraCyc, the Arabidopsis metabolic pathway database
- The Plant Metabolic Network (PMN) maintains a set of metabolic pathway databases for Arabidopsis and other plants
- Provides tools to analyze metabolic data
- Generates new metabolic pathway databases for crops and other important plants

Metabolic pathway data in AraCyc at the PMN
AraCyc pathway pages contain several types of data:
- Pathway
- Enzyme
- Gene
- Reaction
- Compound
- Evidence code

Metabolic pathway data in AraCyc at the PMN
- Pathway pages contain curated comments and useful links

Creating customized data sets: PMN
Data sets can be obtained using several strategies at the PMN:
- Advanced search page
- Data-mining tools (*coming soon*)
- Metabolic pathway pages

Enhancing customized data sets
Additional information can be obtained for your data set:
- Bulk data retrieval tool
- FTP files

Customized data sets: case studies
You have mapped a mutation that disrupts flower development to a region of Chromosome 1
What are some good candidates in the mapping interval?
- Get a list of all the genes in the mapping interval and find candidates involved in flower development
- Find all the associated gene function (GO) and expression (PO) annotations for the candidate genes
- Obtain gene confidence scores for all associated gene models to choose sequence for complementation

Customized data sets: Flower development
- Get a list of all the genes in a mapping interval involved in flower development
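Once gene coordinates and annotations have been downloaded, the interval step can also be sketched offline in Python. This is a minimal sketch, not TAIR's own search: the `Gene` record, the `genes_in_interval` helper, the locus names, and every coordinate below are hypothetical placeholders.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    locus: str
    chromosome: str
    start: int          # 1-based, inclusive
    end: int            # 1-based, inclusive
    annotations: tuple  # free-text annotation terms


def genes_in_interval(genes, chromosome, start, end, keyword: Optional[str] = None):
    """Return loci overlapping [start, end] on a chromosome.

    If keyword is given, keep only genes with an annotation containing it.
    """
    hits = []
    for gene in genes:
        overlaps = (gene.chromosome == chromosome
                    and gene.start <= end and gene.end >= start)
        if not overlaps:
            continue
        if keyword is not None and not any(
                keyword.lower() in term.lower() for term in gene.annotations):
            continue
        hits.append(gene.locus)
    return hits


# Hypothetical records: names and coordinates are invented for illustration.
GENES = [
    Gene("GENE_A", "Chr1", 2891000, 2894000, ("flower development", "kinase activity")),
    Gene("GENE_B", "Chr1", 2900000, 2902000, ("response to oxidative stress",)),
    Gene("GENE_C", "Chr2", 2891500, 2893000, ("flower development",)),
]

candidates = genes_in_interval(GENES, "Chr1", 2890000, 2905000,
                               keyword="flower development")
```

With real data, the records would come from a downloaded annotation file rather than a hand-written list.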

Customized data sets: Flower development
- AT1G09000: MAP kinase kinase kinase activity, cellular component unknown, embryo, flower, flower development, kinase activity, leaf, petal differentiation and expansion stage, response to oxidative stress, root, seed, shoot apex, whole plant, D bilateral stage, E expanded cotyledon stage, F mature embryo stage
- Choose gene models to express for complementation experiments...

Customized data sets: Flower development
- Obtain gene confidence scores for all associated gene models

Creating customized data sets
You work on a transcription factor that affects jasmonic acid biosynthesis
Do JA biosynthetic genes share common sequences in their promoters?
- Obtain a list of all the genes involved in JA biosynthesis
- Get upstream promoter sequences
- Search for over-represented DNA sequences in promoters

Customized data sets: JA biosynthesis
- jasmonic acid

Customized data sets: JA biosynthesis
- Take this gene list to TAIR... to get upstream sequences

Customized data sets: JA biosynthesis
- Get upstream promoter sequences
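For readers who want to see what "upstream" means in coordinate terms, here is a minimal Python sketch. It is not how TAIR's retrieval tool is implemented: the `upstream_sequence` helper, the toy chromosome string, and the choice to measure from the annotated gene start (rather than a verified transcription start site) are all assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_sequence(chrom_seq, start, end, strand, length):
    """Return up to `length` bases upstream of a gene's 5' end.

    start/end are 1-based inclusive coordinates on the forward strand.
    The result is written 5'->3' on the gene's own strand and is
    truncated at the chromosome ends.
    """
    if strand == "+":
        low = max(0, start - 1 - length)
        return chrom_seq[low:start - 1]
    # Minus strand: upstream lies to the right of `end` on the forward strand.
    high = min(len(chrom_seq), end + length)
    return reverse_complement(chrom_seq[end:high])


# Toy chromosome, invented for illustration; the "gene" occupies bases 7-9.
TOY_CHROMOSOME = "AAACCCGGGTTT"
plus_promoter = upstream_sequence(TOY_CHROMOSOME, 7, 9, "+", 3)
minus_promoter = upstream_sequence(TOY_CHROMOSOME, 7, 9, "-", 3)
```

The strand check is the step most often forgotten: for a minus-strand gene the promoter sits at higher coordinates and has to be reverse-complemented.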

Customized data sets: JA biosynthesis
- Search for over-represented or prevalent DNA sequences in promoters
- Use the Motif Analyzer in TAIR to identify common 6-mers
- AT1G69490, AT1G48270, AT1G11870, AT1G12820
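The idea of a "common 6-mer" can be illustrated with a few lines of Python. This sketch only reports 6-mers present in a chosen fraction of the input promoters; it does not reproduce the statistics of TAIR's motif tool and does no background correction. The `shared_kmers` helper and the three toy promoter sequences are hypothetical.

```python
from collections import Counter


def shared_kmers(promoters, k=6, min_fraction=1.0):
    """Return k-mers found in at least min_fraction of the sequences.

    promoters maps a name to a DNA string. Each k-mer is counted at most
    once per sequence, so the count is "number of promoters containing it".
    """
    presence = Counter()
    for seq in promoters.values():
        seq = seq.upper()
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        presence.update(kmers)
    needed = min_fraction * len(promoters)
    return sorted(kmer for kmer, n in presence.items() if n >= needed)


# Toy promoter sequences, invented for illustration only.
TOY_PROMOTERS = {
    "p1": "AAACGTCGGTTT",
    "p2": "GGCGTCGGCC",
    "p3": "TTTTCGTCGGA",
}

common = shared_kmers(TOY_PROMOTERS)
```

A real over-representation analysis would also compare each count with its frequency in all promoters of the genome, and would usually scan both strands.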

Creating customized data sets
You are studying a protein with an exciting new domain: Thr-x-Ala-x-Ile-x-Arg
Are there other TxAxIxR proteins? Do they share additional domains?
- Find all of the proteins that have the TxAxIxR domain
- Identify all of the other domains found in those proteins

Customized data sets: TxAxIxR proteins
- Find all of the proteins that have the TxAxIxR domain
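The TxAxIxR pattern, where x stands for any amino acid, maps directly onto a regular expression. The following is a minimal sketch for searching a set of protein sequences held in memory; the `proteins_with_motif` helper and the toy sequences are hypothetical, and a pattern search on the TAIR site would be the equivalent step against the real proteome.

```python
import re

# T, any residue, A, any residue, I, any residue, R
TXAXIXR = re.compile(r"T[A-Z]A[A-Z]I[A-Z]R")


def proteins_with_motif(proteins, pattern=TXAXIXR):
    """Return the names of proteins whose sequence contains the pattern."""
    return sorted(name for name, seq in proteins.items()
                  if pattern.search(seq.upper()))


# Toy protein sequences, invented for illustration only.
TOY_PROTEINS = {
    "prot1": "MKTGAQILRPP",   # contains T-G-A-Q-I-L-R
    "prot2": "MKTAIRPP",      # T, A, I, R are adjacent, so no match
    "prot3": "MSTLAVIKRG",    # contains T-L-A-V-I-K-R
}

matches = proteins_with_motif(TOY_PROTEINS)
```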

Customized data sets: TxAxIxR proteins
- Identify all of the other domains found in those proteins

Analyzing data sets
Sometimes you want to analyze data sets
We have a few analysis tools:
- Analyze = DISPLAY data in a visual manner with a few statistics
- Data must be pre-cleaned
If you want to display quantitative metabolic data on genes, enzymes or compounds:
- OMICS viewer
If you want to look for over-represented annotations for a list of genes or proteins (e.g. all the genes up-regulated in a mutant, or all of the proteins found in the ovule):
- GO categorization tool

GO categorization
- Classify your list of genes/proteins using GO annotations
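The two ideas behind this step, counting genes per annotation and asking whether a count is higher than expected, can be sketched in Python. This is an illustration under stated assumptions, not the method used by the TAIR tool: the `categorize` and `enrichment_p_value` helpers, the gene names, and the annotation table are hypothetical, and the test shown is a plain one-sided hypergeometric test with no multiple-testing correction.

```python
from collections import Counter
from math import comb


def categorize(gene_list, annotations):
    """Count how many genes in gene_list carry each annotation term."""
    counts = Counter()
    for gene in gene_list:
        counts.update(set(annotations.get(gene, ())))
    return counts


def enrichment_p_value(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N genes in the background, K of them carry the term,
    n genes in the list, k of those carry the term.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical annotation table standing in for a whole-genome GO file.
ANNOTATIONS = {
    "gene1": ["flower development", "kinase activity"],
    "gene2": ["flower development"],
    "gene3": ["response to oxidative stress"],
    "gene4": ["kinase activity"],
    "gene5": [],
}

my_list = ["gene1", "gene2", "gene3"]
counts = categorize(my_list, ANNOTATIONS)

term = "flower development"
in_background = sum(term in terms for terms in ANNOTATIONS.values())
p_value = enrichment_p_value(counts[term], len(my_list),
                             in_background, len(ANNOTATIONS))
```

With only five background genes the p-value is not meaningful; the point is the bookkeeping of the four numbers that go into the test.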

GO categorization

... or use a tool at AmiGO (on hand-out)

Putting TAIR and the PMN to work for you
- Use TAIR to find detailed information for specific genes or proteins
  - Locus page, gene model page, protein page
    - Many sections, many data types, many external links
  - GBrowse
    - Many tracks
    - New gene confidence scores as part of TAIR9 release
- Use TAIR and the PMN to generate and work with customized data sets
  - Create and add data to lists of proteins and genes
    - Specific and Advanced Search pages
    - Motif analysis tools
    - FTP files with large data sets
  - Visualize and analyze data
    - OMICs viewer (PMN)
    - GO categorization (TAIR)
- If you're having trouble getting any information you want...

We are here to help!
- Please visit us and ask questions at the Curation Booth!
- Workshop Part II: Practice sets and individual help

Acknowledgements
TAIR, AraCyc, and the PMN

Current Curators:
- Tanya Berardini (lead curator, functional annotation)
- David Swarbreck (lead curator, structural annotation)
- Peifen Zhang (Director and lead curator, metabolism)
- A. S. Karthikeyan (curator)
- Philippe Lamesch (curator)
- Donghui Li (curator)
- Rajkumar Sasidharan (curator)

Recent Past Contributors:
- Debbie Alexander (curator)
- Christophe Tissier (curator)
- Hartmut Foerster (curator)

Tech Team Members:
- Bob Muller (Manager)
- Larry Ploetz (Sys. Administrator)
- Raymond Chetty
- Anjo Chi
- Vanessa Kirkup
- Cynthia Lee
- Tom Meyer
- Shanker Singh
- Chris Wilks

Metabolic Pathway Software:
- Peter Karp and SRI group

Eva Huala (Director and Co-PI)
Sue Rhee (PI and Co-PI)

Part I: Tips and techniques from curators
Bonus slides...

Customized data sets: Flower development
- Find all the associated GO terms and PO terms and get evidence codes

Customized data sets: JA biosynthesis
- Obtain a list of all the genes involved in JA biosynthesis

Another option
- Use pathway page

OMICs Viewer

Customized data sets: JA biosynthesis
- Experimental results provide a more detailed sequence: (A or T)C(A or C or G)TCGGT(G or T)A
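In standard IUPAC nucleotide codes this degenerate sequence reads WCVTCGGTKA (W = A or T, V = A, C or G, K = G or T). A minimal Python sketch for scanning a promoter for it follows; the `consensus_to_regex` and `find_sites` helpers and the toy promoter are hypothetical, and only the forward strand is searched.

```python
import re

# Subset of the IUPAC nucleotide codes needed for this consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "K": "[GT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(sequence, consensus):
    """Return 0-based start positions of all (overlapping) matches."""
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


CONSENSUS = "WCVTCGGTKA"

# Toy promoter, invented for illustration only.
TOY_PROMOTER = "GGACATCGGTGACC"
sites = find_sites(TOY_PROMOTER, CONSENSUS)
```

To cover both strands, the same search can be repeated on the reverse complement of each promoter.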