Carnegie Institution for Science, Department of Plant Biology.


1 Carnegie Institution for Science, Department of Plant Biology

2 Putting TAIR to work for you: Tips and Techniques for Accessing Arabidopsis Data for Plant Biology Research
Kate Dreher
TAIR, AraCyc, PMN
Carnegie Institution for Science

3 Overview
Part I: Presentation (with exercises)
- Finding a specific gene of interest in TAIR
- Looking at the data on the locus, gene model, and protein pages
- Getting to know GBrowse
- Creating and enhancing customized data sets
- Tips for working with Arabidopsis
Part II: Practice problems and individual help
- Hand-outs with practice problems to work on
- Questions from participants
- Individual help
All documents are available in electronic form:
- Resource guide
- Questions, answers, and practice data
- "Welcome to TAIR" presentation and this presentation

4 What is TAIR?
The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model plant Arabidopsis: www.arabidopsis.org
Curators and programmers at TAIR:
- Collect, store, and organize Arabidopsis data
- Attach functional information to genes
- Improve gene structures
- Provide tools to analyze data
- Work with the ABRC to provide seeds and clones

5 Finding the gene you want
Case 1: You have a non-Arabidopsis gene and want to find its homolog
- http://www.ncbi.nlm.nih.gov/nuccore/148189857?report=genbank
Case 2: You know exactly what Arabidopsis gene you want
- You know the AGI locus code (e.g. At2g46990)
- You know the gene symbol (e.g. PhyA)
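Before searching with an AGI locus code, it can help to check that the identifier is well formed. The sketch below is a minimal illustration, not a TAIR tool: the function name `parse_agi` is hypothetical, and the pattern assumes the usual AGI layout ("AT", a chromosome of 1-5, C, or M, "G", five digits, and an optional ".N" gene-model suffix).

```python
import re

# Assumed AGI layout: AT + chromosome (1-5, C = chloroplast, M = mitochondrion)
# + G + five digits, optionally followed by ".N" for a gene model (splice variant).
AGI_PATTERN = re.compile(r"^AT([1-5CM])G(\d{5})(?:\.(\d+))?$", re.IGNORECASE)


def parse_agi(code):
    """Return the parts of an AGI identifier, or None if it does not match.

    parse_agi is a hypothetical helper written for this example.
    """
    match = AGI_PATTERN.match(code.strip())
    if match is None:
        return None
    chromosome, number, model = match.groups()
    return {
        "locus": f"AT{chromosome.upper()}G{number}",  # normalized to upper case
        "chromosome": chromosome.upper(),
        "gene_model": int(model) if model else None,  # None for a bare locus code
    }
```

A gene symbol such as PhyA does not match this pattern, so `parse_agi` returns `None` for it; symbols need a name search rather than a locus-code lookup.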

6 Finding a gene: practice problems
- You are reading a paper about an interesting phenotype caused by a mutation in the AN gene. Find the AGI locus code of this gene.
- You find an EST that is expressed at high levels in the seed of your Phaseolus vulgaris variety: GenBank: AB304457. (To find the gene in GenBank, google NCBI and you should find the page.) Find the AGI locus codes of the top three hits in TAIR using BLAST. Is it the same if you BLAST with the transcript or the protein?
  - Based on the transcript
  - Based on the protein

7 Finding a gene: practice problems
- You are reading a paper about an interesting phenotype caused by a mutation in the AN gene. Find the AGI locus code of this gene.
  Answer: AT1G01510 (a.k.a. ANGUSTIFOLIA)
- You find an EST that is expressed at high levels in the seed of your Phaseolus vulgaris variety: GenBank: AB304457. Find the AGI locus codes of the top three hits in TAIR using BLAST. Is it the same if you BLAST with the transcript or the protein?
  Based on the transcript:
    AT1G14920.1 | Symbols: GAI, RGA2 | GAI (GIBBERELLIC ACID IN... 62 3e-08
    AT3G03450.1 | Symbols: RGL2 | RGL2 (RGA-LIKE 2); transcript... 44 0.007
    AT2G01570.1 | Symbols: RGA1, RGA | RGA1 (REPRESSOR OF GA1-3... 44 0.007
  Based on the protein:
    AT1G14920.1 | Symbols: GAI, RGA2 | GAI (GIBBERELLIC ACID IN... 647 0.0
    AT2G01570.1 | Symbols: RGA1, RGA | RGA1 (REPRESSOR OF GA1-3... 632 0.0
    AT1G66350.1 Symbols: RGL1, RGL | RGL1 (RGA-LIKE 1
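The transcript and protein hit lists can also be compared programmatically. The following is a rough sketch that assumes each summary line ends with a bit score and an E-value, as in the lines shown on this slide; `parse_hit` and `loci` are hypothetical helpers, and the third protein hit is entered by locus only because its line is truncated on the slide.

```python
def parse_hit(line):
    """Split one BLAST summary line into gene model, locus, symbols, score, E-value."""
    # Assumption: the last two whitespace-separated tokens are score and E-value.
    description, score, evalue = line.rsplit(None, 2)
    fields = [field.strip() for field in description.split("|")]
    gene_model = fields[0]
    symbols = []
    for field in fields[1:]:
        if field.startswith("Symbols:"):
            symbols = [s.strip() for s in field[len("Symbols:"):].split(",") if s.strip()]
    return {
        "gene_model": gene_model,
        "locus": gene_model.split(".")[0],  # drop the ".N" gene-model suffix
        "symbols": symbols,
        "score": float(score),
        "evalue": float(evalue),
    }


def loci(lines):
    """Return the locus codes for a list of summary lines, in order."""
    return [parse_hit(line)["locus"] for line in lines]


TRANSCRIPT_HITS = [
    "AT1G14920.1 | Symbols: GAI, RGA2 | GAI (GIBBERELLIC ACID IN... 62 3e-08",
    "AT3G03450.1 | Symbols: RGL2 | RGL2 (RGA-LIKE 2); transcript... 44 0.007",
    "AT2G01570.1 | Symbols: RGA1, RGA | RGA1 (REPRESSOR OF GA1-3... 44 0.007",
]
PROTEIN_HITS = [
    "AT1G14920.1 | Symbols: GAI, RGA2 | GAI (GIBBERELLIC ACID IN... 647 0.0",
    "AT2G01570.1 | Symbols: RGA1, RGA | RGA1 (REPRESSOR OF GA1-3... 632 0.0",
]

transcript_loci = loci(TRANSCRIPT_HITS)
# The third protein hit is truncated on the slide, so only its locus is used here.
protein_loci = loci(PROTEIN_HITS) + ["AT1G66350"]
shared = sorted(set(transcript_loci) & set(protein_loci))
print(shared)
```

With the hits listed on the slide, two loci appear in both lists and one differs in each, so the two searches do not return the same top three.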

8 Choosing the proper search result Locus Gene Model Protein

9 The Locus page: Lots of information

10 Looking at the locus page: practice problems 1
You're interested in learning more about a gene called PMR2 (Powdery Mildew Resistant 2).
- What is its AGI locus code?
- How many splice variants does it have?
- Which one has the shorter coding region?
- What is another name for this gene?
- What is the evidence for it being involved in the "defense response to fungus, incompatible interaction"?
- How many total loci are annotated to this term?
- Which paper provides experimental evidence that PMR2 is located in the plasma membrane?
- What is the title of that paper?

11 Looking at the locus page: practice problems 1
You're interested in learning more about a gene called PMR2 (Powdery Mildew Resistant 2).
- What is its AGI locus code?
  Answer: At1g11310
- How many splice variants does it have?
  Answer: 2
- Which one has the shorter coding region?
  Answer: At1g11310.2
- What is another name for this gene?
  Answer: Mildew Resistant Locus 2 (MLO2)
- What is the evidence for it being involved in the "defense response to fungus, incompatible interaction"?
  Answer: Inferred from Mutant Phenotype; analysis of visible trait; Consonni 2005
- How many total loci are annotated to this term?
  Answer: 44
- Which paper provides experimental evidence that PMR2 is located in the plasma membrane?
  Answer: Benschop 2007
- What is the title of that paper?
  Answer: Quantitative phospho-proteomics of early elicitor signalling in Arabidopsis.

12 Looking at the locus page: practice problems 2
You're interested in learning more about a gene called PMR2 (Powdery Mildew Resistant 2).
- How many cDNAs are associated with this locus?
- Which are available to order from the ABRC?
- What is the length of the full-length coding region?
- What is the isoelectric point of the protein?
- For the PERL0025782 polymorphism, what is the nucleotide difference between the Col and Bor-4 ecotypes?

13 Looking at the locus page: practice problems 2
You're interested in learning more about a gene called PMR2 (Powdery Mildew Resistant 2).
- How many cDNAs are associated with this locus?
  Answer: 3
- Which are available to order?
  Answer: none
- What is the length of the full-length coding region?
  Answer: 1722 bp
- What is the isoelectric point of the protein?
  Answer: 9.8492
- For the PERL0025782 polymorphism, what is the nucleotide difference between the Col-0 and Bor-4 ecotypes?
  Answer: Col
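The coding-region length can be related to the length of the encoded protein with simple codon arithmetic. This is a minimal sketch under a stated assumption: it treats the reported length as including the stop codon, which the slide does not specify. The function name `protein_length_from_cds` is hypothetical.

```python
def protein_length_from_cds(cds_length_bp, includes_stop=True):
    """Convert a coding-region length in bp to a protein length in amino acids.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute an amino acid.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("a complete coding region should be a multiple of 3 bp")
    codons = cds_length_bp // 3
    return codons - 1 if includes_stop else codons


# 1722 bp is 574 codons; under the stop-codon assumption that is 573 residues.
print(protein_length_from_cds(1722))
```

The length check also catches partial sequences, which is useful when the same calculation is run over a whole list of coding regions.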

14 Looking at the locus page: practice problems 3
You're interested in learning more about a gene called PMR2 (Powdery Mildew Resistant 2).
- Does the pmr2-1 mutant form lesions in response to powdery mildew attack?
- What is the putative location of the T-DNA insertion in mlo2-6?
- What is the ecotype of SAIL_878_H12?
- How many publications are available for this gene for 2007?
- Which paper also mentions the PMR3 gene?
- How many papers mention the mlo2 allele/mutant when you do a Textpresso search?

15 Looking at the locus page: practice problems 3
You're interested in learning more about a gene called PMR2 (Powdery Mildew Resistant 2).
- Does the pmr2-1 mutant form lesions in response to powdery mildew attack?
  Answer: no
- What is the putative location of the T-DNA insertion in mlo2-6?
  Answer: intron
- What is the ecotype of SAIL_878_H12?
  Answer: Col-0
- How many articles and how many abstracts are available for this gene for 2007?
  Answer: 2 abstracts, 1 article
- Which paper also mentions the PMR3 gene?
  Answer: Isolation and characterization of powdery mildew-resistant Arabidopsis mutants, PNAS 2000
- How many papers mention the mlo2 allele/mutant when you do a Textpresso search?
  Answer: 8

16 Locus page links: practice problems
You're interested in learning more about a gene called PMR2 (Powdery Mildew Resistant 2).
- According to the Genevestigator Gene Atlas, which organ has the highest level of expression?
- According to the Genevestigator Response viewer, was the level of PMR2 transcript higher 1 hr or 4 hrs after treatment with the fungal elicitor FL22?
- According to the eFP site, are the absolute levels of PMR2 expression higher in the root or the shoot of a seedling, 6 hours after a cold treatment?
- In the SUBA database, where does the MS/MS data indicate that this protein is located?
- According to InParanoid, how many poplar genes fall into the same group?
- On the ATTED-II page, how many genes are directly linked to PMR2 by co-expression analysis, and which has the strongest correlation?

17 Locus page links: practice problems
You're interested in learning more about a gene called PMR2 (Powdery Mildew Resistant 2).
- According to the Genevestigator Gene Atlas, which organ has the highest level of expression?
  Answer: senescent rosette leaf
- According to the Genevestigator Response viewer, was the level of PMR2 transcript higher 1 hr or 4 hrs after treatment with the fungal elicitor FL22?
  Answer: It is higher 1 hour after treatment
- According to the eFP site, are the absolute levels of PMR2 expression higher in the root or the shoot of a seedling, 6 hours after a cold treatment?
  Answer: They are higher in the root
- In the SUBA database, where does the MS/MS data indicate that this protein is located?
  Answer: plasma membrane
- According to InParanoid, how many poplar genes fall into the same group?
  Answer: 2
- On the ATTED-II page, how many genes are directly linked to PMR2 by co-expression analysis, and which has the strongest correlation?
  Answer: 5; At2g44180 is the strongest

18 Do we need anything besides the locus, gene model, and protein pages?

19 How many Papaya genes are found in the same cluster as PMR2 in Phytozome? How many Vitis vinifera genes?

20 Basic navigation and tools in GBrowse
- Use controls to zoom and scroll along chromosome
- Get sequence
- Enter locus, marker, etc.
***Many tracks now contain data from the TAIR9 release on Monday, June 22

21 GBrowse = Gobs of Information

22 GBrowse: practice problems
- How many papaya homologs are displayed from Phytozome? And how many amino acids are in the putative ortholog that has the Mlo domain?
- There are two regulatory regions located upstream of this gene. Which one has been linked to a cis element in rice?
- Which of the following has a longer transcript assembly aligning with PMR2?
  - Saccharum officinarum or Triticum aestivum?
  - Solanum tuberosum or Vitis vinifera?
- Are there any experimentally supported phosphorylation sites?
- What polymorphism appears to occur in the 5th intron?
- Is there peptide support for the third exon? The fourth exon? The fifth exon? And which gene model is supported by peptide evidence?
- Which exon structure seems to be better supported by the Brassica cDNA? By the Radish clones?

23 GBrowse: practice problems
- How many papaya homologs are displayed from Phytozome? And how many amino acids are in the putative ortholog that has the Mlo domain?
  Answer: 2; 350 amino acids
- There are two regulatory regions located upstream of this gene. Which one has been linked to a cis element in rice?
  Answer: AtREG417
- Which of the following has a longer transcript assembly aligning with PMR2?
  - Saccharum officinarum or Triticum aestivum? Answer: Triticum aestivum
  - Solanum tuberosum or Vitis vinifera? Answer: Solanum tuberosum
- Are there any experimentally supported phosphorylation sites?
  Answer: Yes, from the motif: SVENYPSSPSPR
- What polymorphism appears to occur in the 5th intron?
  Answer: PERL0025787
- Is there peptide support for the third exon? The fourth exon? The fifth exon? And which gene model is supported by peptide evidence?
  Answer: third – yes; fourth – no; fifth – yes; the At1g11310.1 model is supported
- Which exon structure seems to be better supported by the Brassica cDNA? By the Radish clones?
  Answer: the At1g11310.1 model is better supported by both types of transcripts

24 Sometimes, one gene isn't enough...
- Scientists often want to work with more than one gene or protein, related through some common feature
- TAIR (and the PMN) offer some basic tools to create and/or enhance these customized data sets

25 Creating customized data sets
Data sets can be based on many different criteria:
- Overall sequence alignment (DNA or protein)
- Sequence motifs (DNA or protein)
- Protein domains and biochemical properties
- Gene/Protein function
  - Subcellular location
  - Molecular function
  - Biological process
- Expression pattern
- Biochemical pathway
- Mapping region
- Phenotype
- Gene families
How do you generate these data sets?

26 Creating data sets: practice problems
- How many DNA stocks are associated with NPR1? Do any of those available from the ABRC have full-length cDNAs?
- How many keywords contain the term oxalate? How many of them have been used to annotate Arabidopsis genes?
- How many germplasms are associated with a reduced seed set phenotype?
- How many genes encode proteins that are found in the chloroplast stroma based on a direct assay?
- Try to get the calculated pIs for all the chloroplast stroma proteins and find the highest and lowest values.
- How many proteins have the following domain: Gly-Arg-Ala-Asn-hydrophobic residue (GRAN[hydrophilic])?

27 Creating data sets: practice problems
- How many DNA stocks are associated with NPR1? Do any of those available from the ABRC have full-length cDNAs?
  Answer: 11; yes, the two stocks available from the ABRC have full-length cDNAs
- How many keywords contain the term oxalate? How many of them have been used to annotate Arabidopsis genes?
  Answer: 11 keywords; two have been used for Arabidopsis
- How many germplasms are associated with a reduced seed set phenotype?
  Answer: 68
- How many genes encode proteins that are found in the chloroplast stroma based on a direct assay?
  Answer: 396 loci
- Try to get the calculated pIs for all the chloroplast stroma proteins and find the highest and lowest values.
  Answer: 4.25, 12.66
- How many proteins have the following domain: Gly-Arg-Ala-Asn-hydrophobic residue (GRAN[hydrophilic])?
  Answer: 32
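Once a protein data set has been downloaded, the motif and pI questions can be approximated locally. The sketch below is illustrative only and does not reproduce TAIR's motif tools: the residue class used for the fifth position (`HYDROPHOBIC`) is an assumption, the helper names are hypothetical, and the sequences and the middle pI value are toy data rather than TAIR records.

```python
import re

# Assumption: one common choice of hydrophobic residues for the fifth position.
HYDROPHOBIC = "AVILMFWY"
MOTIF = re.compile(f"GRAN[{HYDROPHOBIC}]")


def proteins_with_motif(proteins):
    """Return the sorted names of proteins whose sequence contains the motif."""
    return sorted(name for name, seq in proteins.items() if MOTIF.search(seq))


def pi_range(pi_by_protein):
    """Return the (lowest, highest) calculated pI in a name -> pI mapping."""
    values = list(pi_by_protein.values())
    return min(values), max(values)


# Toy sequences invented for this example.
toy_proteins = {
    "toy1": "MKGRANLTT",  # GRAN followed by L: matches
    "toy2": "MKGRANDTT",  # GRAN followed by D: does not match
    "toy3": "AAAGRANV",   # GRAN followed by V: matches
}
print(proteins_with_motif(toy_proteins))

# Toy pI table; only the extremes echo the values quoted on the slide.
toy_pis = {"a": 4.25, "b": 9.85, "c": 12.66}
print(pi_range(toy_pis))
```

Changing `HYDROPHOBIC` to a different residue class is the only edit needed to test another reading of the fifth position.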

28 Putting TAIR to work for you
Use TAIR to find detailed information for a specific gene / protein
- Locus page, gene model page, protein page
  - Many sections, many data types, many external links
- GBrowse
  - Many tracks
Use TAIR to create and enhance customized data sets
- Specific and Advanced Search pages
- Motif analysis tools
- FTP files with large data sets
Use TAIR for data visualization and analysis
- GO categorization (TAIR)
- OMICs viewer (PMN)
If you're having trouble getting any information you want from TAIR...

29 We are here to help: www.arabidopsis.org
- Please use our data
- Please use our tools
- Please use TAIR to help improve your research on IMPORTANT plants!
- Please contact us if we can be of any help!
- Make an appointment to meet with me during my visit (I can try to speak Spanish)
curator@arabidopsis.org
www.arabidopsis.org

30 Thank you!
TAIR, AraCyc, and the PMN
Sue Rhee (PI and Co-PI)
Eva Huala (Director and Co-PI)
Current Curators:
- Tanya Berardini (lead curator – functional annotation)
- David Swarbreck (lead curator – structural annotation)
- Peifen Zhang (Director and lead curator – metabolism)
- A. S. Karthikeyan (curator)
- Philippe Lamesch (curator)
- Donghui Li (curator)
- Rajkumar Sasidharan (curator)
Recent Past Contributors:
- Debbie Alexander (curator)
- Christophe Tissier (curator)
- Hartmut Foerster (curator)
Tech Team Members:
- Bob Muller (Manager)
- Larry Ploetz (Sys. Administrator)
- Raymond Chetty
- Anjo Chi
- Vanessa Kirkup
- Cynthia Lee
- Tom Meyer
- Shanker Singh
- Chris Wilks
Metabolic Pathway Software:
- Peter Karp and SRI group
