The PMN: Your friend in plant metabolism
Kate Dreher, curator, PMN/AraCyc/TAIR


PMN = The Plant Metabolic Network
- Created in 2008
- Funded by the National Science Foundation
- www.plantcyc.org
How do I find information?
How do I build a metabolic pathway database for my favorite plant?
Putting the PMN to use
Sue Rhee (PI), Peifen Zhang (Director)

Finding the information you want...
- Choose a database...
- Use the basic search tools

Basic searching: practice problems
1. What are some of the pathways that talk about salt stress in the PMN?
2. How many amides are in PlantCyc?
3. Are there more PlantCyc or AraCyc reactions listed under the E.C. category 3 (hydrolases)?

Basic searching: practice problems
1. What are some of the pathways that talk about salt stress in the PMN?
   - Arabidopsis thaliana proline biosynthesis I: "Measurement of free proline content and gene expression and enzyme activity levels from salt-stress treated plants [Roosens98] showed that in young..."
   - Arabidopsis thaliana choline biosynthesis I: Summers93: Summers PS, Weretilnyk EA (1993). "Choline Synthesis in Spinach in Relation to Salt Stress." Plant Physiol 103(4); PMID:
   - Arabidopsis thaliana arginine biosynthesis III: "Isolation of the ornithine-delta-aminotransferase cDNA and effect of salt stress on its expression in Arabidopsis thaliana." Plant Physiol 117(1);
   - PlantCyc phosphatidylcholine biosynthesis I: "Regulation of phosphatidylcholine biosynthesis under salt stress involves choline kinases in Arabidopsis thaliana." FEBS Lett 566(1-3);
   - PlantCyc putrescine biosynthesis via N-carbamoylputrescine: "... and biosynthetic gene expression in Arabidopsis thaliana under salt stress.... knockout mutation of ADC2 gene reveals inducibility by osmotic stress"
2. How many amides are in PlantCyc? 9
3. Are there more PlantCyc or AraCyc reactions listed under the E.C. category 3 (hydrolases)? More AraCyc reactions

Basic searching: practice problems
- Find a pathway in PlantCyc whose end product is tryptophan.
- Is it part of any superpathways?
- How many anthranilate synthases from Medicago truncatula are associated with this pathway?
- Is there experimental or computational support for their assignment to this reaction?
- Which enzyme is subject to feedback regulation? Please give the gene code, too.
- What type of experimental support exists for this enzyme?
- What is its Km for chorismate?
- Find the protein sequence for the closest ortholog in papaya.
- What is the length of the full-length coding sequence for its closest homolog in soybean?
- How many species are associated with reaction ?
- Find a paper that talks about an enzyme with similar activity in Zea mays.
- What is the chemical formula of anthranilate?
- What are two synonyms for this compound?
- What other pathway is it found in?

Basic information: practice problems
- Find a pathway in PlantCyc whose end product is tryptophan. TRPSYN-PWY – tryptophan biosynthesis
- Is it part of any superpathways? YES, superpathway of phenylalanine, tyrosine and tryptophan biosynthesis
- How many anthranilate synthases from Medicago truncatula are associated with this pathway? 4
- Is there experimental or computational support for their assignment to this reaction? Computational
- Which enzyme is subject to feedback regulation? anthranilate synthase, At5g05730
- What type of experimental support exists for this enzyme? mutant phenotype; genetic interaction
- What is its Km for chorismate? 180 µM
- Find the protein sequence for the closest ortholog in papaya.
  MQTLGFSYRLVPSGRRFSPVPANGISGARSSSLVNVRTFKCMSLSSPSLVCDVKKFAEASKHGNVVPLYH
  SIFSDQLTPVLAYRCLVKEDDREAPSFLFESVEPGFQASSVGRYSVVGAQPTIEIVAKENKVTIMDHEGG
  TLSEEYVQDPMMIPRRISEGWKPQLIDELPDTFCGGWVGYFSYDTIRYVEKKKLPFSMAPEDDRNLADIH
  LGLYDDVIVFDHVEKKAHVIHWVRLDQYSSAEKAYNDGLKRLEKLVAKVQDIDPPRLSPGSVDLQTRQFG
  PSLRKSTMTSEEYKMAVLEAKEHILAGDIFQIVLSQRFERRTFADPFEVYRALRVVNPSPYMTYLQARGC
  ILVASSPEILTRVEKKKIVNRPLAGTVRRGKTTAEDEMLEKQLLNDAKQCAEHIMLVDLGRNDVGKVTGE
  LHDHLTCWDVLRAALPVGTVSGAPKVKAMELIDQLEVTRRGPYSGGFGGISFTGNMDVALALRTIVFPTG
  THYNTMYSYKDVENRRDWIAHLQAGAGIVADSNPDDENQECHNKVAGLARAIDLAESAFVNK*
- What is the length of the full-length coding sequence for its closest homolog in soybean? 1593 bp
- How many species are associated with reaction ? three
- Find a paper that talks about an enzyme with similar activity in Zea mays. "Some physical characteristics of the enzymes of L-tryptophan biosynthesis in higher plants"
- What is the chemical formula of anthranilate? C7H7NO2
- What are two synonyms for this compound? anthranilic acid, 2-aminobenzoic acid
- What other pathway is it found in? acridone alkaloid biosynthesis

Advanced searching
- Advanced Query Form
- Downloading the whole database

Advanced searching: practice problems
- Find a list of all of the enzymes with a pI between 5 and 6.5 that are glucosyltransferases.
  - What species are they found in?
  - What is the highest pI and the lowest pI in the group?
- Get a list of all of the reactions that are children of E.C and identify the compounds in the right and the left
  - How many reactions are in your original list?
  - If you eliminate all of the reactions that have rhamnose in their name, how many do you have left?
- Look for all the pathways that have more than one hypothetical reaction in them.
  - Get all of the citations associated with them
  - How many of the pathways have more than 4 hypothetical reactions associated with them?

Advanced searching: practice problems
- Find a list of all of the enzymes in PlantCyc with a pI between 5 and 6.5 that are glucosyltransferases.
  - How many are there?
  - What species are they found in?
  - What is the highest pI and the lowest pI in the group?
- Get a list of all of the reactions that are children of E.C in PlantCyc and identify the compounds in the right and the left in your output list
  - How many reactions are in your original list?
  - If you eliminate all of the reactions that have rhamnose in their name, how many do you have left?
- Look for all the pathways in PlantCyc that have more than one hypothetical reaction in them.
  - Get all of the citations associated with them
  - How many of the pathways have more than 4 hypothetical reactions associated with them?

Advanced searching: practice problems
- Find a list of all of the enzymes in PlantCyc with a pI between 5 and 6.5 that are glucosyltransferases.
  - How many are there? 13
  - What species are they found in? Zea mays, Dorotheanthus bellidiformis, Cicer arietinum, Brassica napus, Pinus strobus
  - What is the highest pI and the lowest pI in the group? 5 (lowest), 6.36 (highest)
- Get a list of all of the reactions that are children of E.C in PlantCyc and identify the compounds in the right and the left in your output list
  - How many reactions are in your original list? 189
  - If you eliminate all of the reactions that have rhamnose in their name, how many do you have left? 174
- Look for all the pathways in PlantCyc that have more than one hypothetical reaction in them. 17
  - Get all of the citations associated with them
  - How many of the pathways have more than 4 hypothetical reactions associated with them? 6
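For readers who download the whole database rather than use the Advanced Query Form, the first problem can be approximated offline. The sketch below is only an illustration: the `Enzyme` record, the `filter_by_pi` helper, and every value in the sample list are hypothetical and are not taken from PlantCyc.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    """Hypothetical record; real PMN download files use their own layout."""
    name: str
    species: str
    pi: float  # isoelectric point


def filter_by_pi(enzymes, low, high, name_contains):
    """Keep enzymes whose pI lies in [low, high] and whose name matches."""
    needle = name_contains.lower()
    return [e for e in enzymes if low <= e.pi <= high and needle in e.name.lower()]


# Made-up sample data, for illustration only.
enzymes = [
    Enzyme("example glucosyltransferase 1", "Zea mays", 5.2),
    Enzyme("example glucosyltransferase 2", "Pinus strobus", 6.1),
    Enzyme("example glucosyltransferase 3", "Cicer arietinum", 7.2),
    Enzyme("example kinase", "Brassica napus", 5.5),
]

hits = filter_by_pi(enzymes, 5.0, 6.5, "glucosyltransferase")
species = sorted({e.species for e in hits})
pi_range = (min(e.pi for e in hits), max(e.pi for e in hits))
```

The same three follow-up questions (how many, which species, lowest and highest pI) then reduce to `len(hits)`, `species`, and `pi_range`.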

OMICs viewer
- Do your experiment (or get your data)
- Create your input file
  - Absolute values
  - Time course
  - Relative values
- Upload and analyze your data
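As a rough illustration of the "create your input file" step, the sketch below writes a small time-course table with a derived ratio column. It assumes a tab-delimited text file with the gene identifier in the first column and numeric values after it; the function name `write_omics_table` and the gene identifiers and values are hypothetical.

```python
import csv
import io


def write_omics_table(rows, out):
    """Write gene ID, three time points, and the ratio of time point 1 to 2."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for gene_id, t1, t2, t3 in rows:
        # Relative value derived from two absolute values.
        ratio = t1 / t2 if t2 else float("nan")
        writer.writerow([gene_id, t1, t2, t3, f"{ratio:.3f}"])


# Made-up identifiers and expression values, for illustration only.
example_rows = [
    ("GENE_A", 160.0, 640.0, 320.0),
    ("GENE_B", 50.0, 25.0, 10.0),
]

buffer = io.StringIO()
write_omics_table(example_rows, buffer)
omics_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the example self-contained; passing an open file object instead would produce a file ready to upload.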

OMICs viewer: practice problems
Use PMN_taller_OMICS_practice_problem
The columns are:
0 – gene identifiers
1 – time point 1
2 – time point 2
3 – time point 3
4 – rapid change (ratio of time point 1 to 2)
- Find the pathways that have gene expression values of over 600 at time point 2.
  - Which one has the highest number of steps affected in the pathway?
- Create an animation of all three time points
- Look for pathways that had genes whose expression changed more than 4-fold between the first and last time points.
  - How many appear?
- Use the data you have already calculated and look for pathways that have gene expression changes of more than 6-fold in the first time interval.
  - How many appear?

OMICs viewer: practice problems
Use PMN_taller_OMICS_practice_problem
The columns are:
0 – gene identifiers
1 – time point 1
2 – time point 2
3 – time point 3
4 – rapid change (ratio of time point 1 to 2)
- Find the pathways that have gene expression values of over 600 at time point 2.
  - Which one has the highest number of steps affected in the pathway? starch degradation
- Create an animation of all three time points
- Look for pathways that had genes whose expression changed more than 4-fold between the first and last time points.
  - How many appear? 13
- Use the data you have already calculated and look for pathways that have gene expression changes of more than 6-fold in the first time interval.
  - How many appear? 6
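The thresholds in these problems can also be checked by hand before uploading. The sketch below works at the gene level only; mapping genes onto pathways is what the viewer itself does. It assumes that "more than 4-fold" means a change in either direction, and the helper names and sample rows are hypothetical.

```python
def fold_change(a, b):
    """Symmetric fold change between two positive values (always >= 1)."""
    if a <= 0 or b <= 0:
        raise ValueError("fold change needs positive values")
    return max(a, b) / min(a, b)


def select_genes(rows, min_t2=600.0, min_fold=4.0):
    """Return genes above min_t2 at time point 2, and genes whose
    expression changed more than min_fold between time points 1 and 3."""
    high_t2 = [gene for gene, t1, t2, t3 in rows if t2 > min_t2]
    changed = [gene for gene, t1, t2, t3 in rows if fold_change(t1, t3) > min_fold]
    return high_t2, changed


# Made-up identifiers and values, for illustration only.
rows = [
    ("GENE_A", 160.0, 640.0, 320.0),
    ("GENE_B", 50.0, 25.0, 10.0),
    ("GENE_C", 200.0, 610.0, 900.0),
]

high_t2, changed = select_genes(rows)
```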

Building your own metabolic database
Step 1: Genes / Nucleotide Sequences
- Genome sequencing
- Unigene builds
- ESTs
Step 2: Annotate Predicted Proteins
- Annotation source:
  - PMN pipeline
  - JGI or other sequencing consortium
  - Small group of dedicated scientists, etc.
- Annotation tasks:
  - Give enzymes a name
  - Add GO molecular function terms
  - Assign to a MetaCyc or EC reaction
Use PathoLogic and MetaCyc or PlantCyc to create new pathway database
- Remove non-enzymatic proteins before prediction
- After first round of prediction, review unassigned enzymes
- Repredict
Validate new database
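The pre-prediction cleanup described above can be pictured as a simple triage of annotated proteins. This is a sketch under assumed inputs, not part of PathoLogic: the dictionary keys (`id`, `ec`, `reaction`, `is_enzyme`) and the `triage` function are hypothetical names chosen for illustration.

```python
def triage(proteins):
    """Split annotated proteins into three groups:
    assigned      - has an EC number or a reaction assignment
    review        - annotated as an enzyme but not yet assigned to a reaction
    non_enzymatic - everything else (removed before prediction)
    """
    assigned, review, non_enzymatic = [], [], []
    for protein in proteins:
        if protein.get("ec") or protein.get("reaction"):
            assigned.append(protein["id"])
        elif protein.get("is_enzyme"):
            review.append(protein["id"])
        else:
            non_enzymatic.append(protein["id"])
    return assigned, review, non_enzymatic


# Made-up annotation records, for illustration only.
proteins = [
    {"id": "P1", "ec": "4.1.3.27", "is_enzyme": True},
    {"id": "P2", "is_enzyme": True},
    {"id": "P3", "is_enzyme": False},
]

assigned, review, non_enzymatic = triage(proteins)
```

The `review` group corresponds to the unassigned enzymes that are revisited after the first round of prediction.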

Validating your own metabolic database
- PathoLogic errs on the side of over-prediction
- Curators / Scientists validate pathways...

Examine predicted pathways
Search for evidence in published papers, books, etc.
- Is the pathway described in the literature for my species?
- Are the crucial metabolites described in the literature for my species?
- Are there unique reactions associated with the pathway that have assigned genes?
- Is there evidence that this is a universal plant pathway?
  - Is it on the PMN green list?
- Is there evidence that this pathway is NEVER found in plants?
  - Only relevant when prediction is made using MetaCyc
  - Is it on the PMN black list?
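One way to keep track of the answers to these questions is to record them per pathway and derive a provisional recommendation. The slides give the questions but no fixed decision rule, so the ordering below is purely an assumption, and the `PathwayEvidence` record and `recommend` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PathwayEvidence:
    """Curator's answers to the checklist questions for one pathway."""
    described_in_literature: bool = False
    metabolites_reported: bool = False
    unique_reactions_have_genes: bool = False
    on_green_list: bool = False
    on_black_list: bool = False


def recommend(evidence):
    """Provisional recommendation; the rule order is an assumption."""
    if evidence.on_black_list:
        return "remove"
    if evidence.on_green_list or evidence.described_in_literature:
        return "keep"
    if evidence.metabolites_reported or evidence.unique_reactions_have_genes:
        return "needs curator review"
    return "remove"


examples = {
    "black-listed": recommend(PathwayEvidence(on_black_list=True)),
    "green-listed": recommend(PathwayEvidence(on_green_list=True)),
    "partial": recommend(PathwayEvidence(metabolites_reported=True)),
    "no evidence": recommend(PathwayEvidence()),
}
```

A curator would still make the final call; the function only makes the bookkeeping explicit.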

Make necessary changes
- Remove pathways not found in My species
  - glycogen biosynthesis
  - C4 photosynthesis
  - caffeine biosynthesis
- Edit pathways operating via a different route
  - Phenylalanine biosynthesis in bacteria vs. My species

Make necessary changes
- Edit pathways operating via a different route
  - AraCyc Pathway: phenylalanine biosynthesis

Increase the data content
- Add pathways from My species not present in the reference database
  - Secondary metabolites
- Add additional compounds, reactions, and enzymes from My species that cannot be put in a known pathway
- Write new summaries or revise imported summaries for pathways

Provide evidence codes
Evidence codes are used for:
- Pathways
- Enzyme activities
General types of support:
- EV-EXP -> experimental
- EV-COMP -> computational
- EV-IC -> inferred by a curator
- EV-AS -> author statement
Additional information about evidence can be captured:
- EV-EXP-IDA-PURIFIED-PROTEIN: Inferred from direct assay
- EV-EXP-IMP-REACTION-BLOCKED: Inferred from mutant phenotype
- EV-COMP-AINF: Inferred by computation automatically without human oversight
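Because the detailed codes extend the general ones with extra hyphen-separated parts, the general type of support can be read off the second part of any code. The sketch below illustrates that idea using only the four general types listed on the slide; the `general_support` function is a hypothetical helper, not a Pathway Tools call.

```python
# General types of support, as listed on the slide.
GENERAL_TYPES = {
    "EXP": "experimental",
    "COMP": "computational",
    "IC": "inferred by a curator",
    "AS": "author statement",
}


def general_support(code):
    """Map an evidence code such as 'EV-EXP-IDA-PURIFIED-PROTEIN'
    to its general type of support."""
    parts = code.split("-")
    if len(parts) < 2 or parts[0] != "EV":
        raise ValueError(f"not an evidence code: {code!r}")
    try:
        return GENERAL_TYPES[parts[1]]
    except KeyError:
        raise ValueError(f"unknown general type in {code!r}") from None


support = [
    general_support("EV-EXP-IDA-PURIFIED-PROTEIN"),
    general_support("EV-EXP-IMP-REACTION-BLOCKED"),
    general_support("EV-COMP-AINF"),
    general_support("EV-IC"),
]
```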

Provide evidence codes
- Enzyme activity assays
- Radiotracer experiments

We'd love to have you as part of our Plant Metabolic Network of experts and friends!
Add your database to the PMN
- The PMN can host it as a separate individual species or multi-species database
- Enzymes from your validated pathways can be incorporated into PlantCyc pathways
- Newly curated pathways from your species can be added to PlantCyc
  - Pathways
  - Enzyme activities
... but you don't have to wait to make a whole new database!

Please help us to improve the PMN
- Review existing pathways
  - Volunteer to go over a set of pathways in your area of expertise
- Contact us any time you see a problem
  - Use our feedback form

Please give us advice and information

Provide new data
Submit any information concerning:
- Pathways
- Enzymes
- Reactions
- Compounds

Please send us new information

We are here to help:
- Please use our data
- Please use our tools
- Please help us to improve our databases!
- Please contact us if we can be of any help!
- Make an appointment to meet with me during my visit (I can try to speak Spanish)

PMN Acknowledgements
Current Curators:
- A. S. Karthikeyan (curator)
Recent Past Contributors:
- Christophe Tissier (curator)
- Hartmut Foerster (curator)
Collaborators:
- Peter Karp (SRI)
- Ron Caspi (SRI)
- SRI Tech Team
- Lukas Mueller (SGN)
- Anuradha Pujar (SGN)
- Gramene and MedicCyc
Peifen Zhang (Director)
Sue Rhee (PI)
Eva Huala (Co-PI)
Tech Team Members:
- Bob Muller (Manager)
- Larry Ploetz (Sys. Administrator)
- Raymond Chetty
- Anjo Chi
- Vanessa Kirkup
- Cynthia Lee
- Tom Meyer
- Shanker Singh
- Chris Wilks
