Fission Yeast Computing Workshop: Searching, querying, browsing, downloading and analysing data using PomBase


Fission Yeast Computing Workshop -1-
Searching, querying, browsing, downloading and analysing data using PomBase

Basic PomBase features
- Gene page: overview, configuring, browser, “Others” links, QuickGO browser
- Homepage links: Pombelist list, advanced search, genome regions (quick links to centromeres etc.), GO slim, genome browser, genome statistics, gene characterization, FAQ

Simple data mining and analysis
- Create user-defined gene sets and download gene sets in various formats
- Combine (union, intersect and subtract) to make and refine user-defined lists
- “GO slimming”
- “GO enrichment”

Fission Yeast Computing Workshop -2-
Using the “Advanced Query Tool” to create and download some gene sets (Advanced search under the “Find” menu)

The “Query Results” tab allows you to browse and download your search results.

Exercise 1: Create a protein data set and import a gene list
Select Gene Filters, then Genes by type, “protein coding”.

Fission Yeast Computing Workshop -3-
Query history

The “Query History” tab stores previous searches and allows you to union, intersect and subtract them. From your protein coding gene set:

1) Subtract the union of “Annotation status” “dubious” and “transposon”. This gives you the set of protein coding genes for fission yeast.
2) Subtract annotation status “published” to give you the “unpublished” protein coding gene set.
3) Intersect these results with the GO term “nucleus”.
4) Intersect these results with phenotype “viable”.
5) Go to the “Gene List Search” in the FIND menu and import a list of your own, or the list provided here: ftp/pub/yeast/pombe/EMBO/test_list
   This imported list will appear in the query history, which enables you to perform these intersections on any user-defined list.
6) Intersect your uploaded list with the output of step 4 to find the genes in your “user defined list” which are unstudied in fission yeast and annotated as nuclear.
7) Download the list for the slimming exercise later (you only need the systematic identifiers in column 1).
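The union, intersect and subtract operations in the query history behave like ordinary set operations. The sketch below mirrors the steps above with Python sets. All gene identifiers and set contents are made-up placeholders (not real PomBase results), and the variable names are illustrative only.

```python
# Toy gene sets standing in for saved queries; identifiers are placeholders.
protein_coding = {"gene01", "gene02", "gene03", "gene04", "gene05", "gene06"}
dubious = {"gene05"}
transposon = {"gene06"}
published = {"gene01"}
nucleus = {"gene01", "gene02", "gene03"}
viable = {"gene02", "gene04"}
user_list = {"gene02", "gene03", "gene99"}  # stands in for an imported list

# Subtract the union of "dubious" and "transposon" from the protein coding set.
refined = protein_coding - (dubious | transposon)

# Subtract "published" to leave the "unpublished" genes.
unpublished = refined - published

# Intersect with the GO term "nucleus", then with phenotype "viable".
unpublished_nuclear = unpublished & nucleus
unpublished_nuclear_viable = unpublished_nuclear & viable

# Intersect the imported list with the refined set built above.
hits = user_list & unpublished_nuclear_viable

print(sorted(hits))
```

Because each intermediate result is kept under its own name, any of them can be reused in a later combination, which is what the query history provides in the web interface.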

Fission Yeast Computing Workshop -4-
Exercise 2: Creating defined gene sets

This “gene characterization” data is available here in the PomBase website: (and previous totals under “characterization history”, left-hand margin)

You can recreate this data in the Advanced Query tool using queries for “Annotation status”.
- Select the Annotation status “Conserved Unknown”.
- To drill down to the “species distribution”, intersect your “conserved unknown” list with “conserved in vertebrates” or “conserved in fungi only” (under “Other Vocabularies”, “Conserved in…”).

Try some more queries:
- How many proteins have an identified Pfam domain or family assignment?
- How many non-coding RNAs are there between bases 100,000 and 200,000 on chromosome 1?
- How many proteins are longer than 1000 amino acids on the left arm of chromosome 3?
- Are any of these “conserved unknown”?
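Queries like the ones above are filters over gene records. The following is a minimal sketch of two such filters on a toy table. The `Gene` record, its field names and all values are hypothetical and do not reflect the PomBase data model. "Between bases" is assumed here to mean that the feature lies entirely inside the range, and the chromosome arm restriction is left out because it would need a centromere coordinate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    gene_type: str       # e.g. "protein coding" or "ncRNA"
    chromosome: int
    start: int
    end: int
    protein_length: int  # 0 for non-coding genes


# Made-up records for illustration only.
genes = [
    Gene("toy1", "ncRNA", 1, 120_000, 121_000, 0),
    Gene("toy2", "ncRNA", 1, 250_000, 251_000, 0),
    Gene("toy3", "protein coding", 1, 150_000, 153_500, 1100),
    Gene("toy4", "protein coding", 3, 50_000, 54_000, 1250),
    Gene("toy5", "protein coding", 3, 90_000, 91_000, 300),
]


def ncrnas_in_range(records, chromosome, lo, hi):
    """Non-coding RNAs lying entirely within [lo, hi] on one chromosome."""
    return [
        g for g in records
        if g.gene_type == "ncRNA"
        and g.chromosome == chromosome
        and g.start >= lo
        and g.end <= hi
    ]


def long_proteins(records, chromosome, min_length):
    """Protein coding genes on one chromosome longer than min_length residues."""
    return [
        g for g in records
        if g.gene_type == "protein coding"
        and g.chromosome == chromosome
        and g.protein_length > min_length
    ]


ncrna_hits = ncrnas_in_range(genes, 1, 100_000, 200_000)
long_hits = long_proteins(genes, 3, 1000)
print(len(ncrna_hits), len(long_hits))
```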

Fission Yeast Computing Workshop -5-
GO “slimming”

- A “slim” is a high-level view of GO (genes annotated to granular terms are mapped to higher-level terms).
- It allows users to group genes into broader categories to assess their distribution, for genome-wide analyses or smaller gene sets.
- Different annotation groups (organism databases) have created specific GO slims, which are available at GO’s FTP site (fission yeast now has an “official GO slim” which gives good coverage of high-level processes).
- You can create and use your own GO slim with high-level terms of interest.

A fission yeast GO slim has been created for process terms. This slim gives good coverage of annotated proteins (most annotated proteins are mapped to the slim). It should be suitable for general purposes, but to slim experimental results you may want to change the terms in the slim slightly to best represent your dataset. There are some guidelines for creating user-defined slims here:

Note: this is not a gene product count, as gene products have multiple annotations; this means that it doesn’t make sense to display this information as a pie chart.

For most purposes this slim would be inadequate (the terms are very broad), but it demonstrates the categories “unknown” (unannotated) and “other” (annotated to some other term in the slim).

There are usually many more annotations than genes (e.g. here). Many genes are annotated to multiple high-level terms. A pie chart does not show the percentage of the genome involved in a particular process, although it is often used and interpreted that way. Histograms with absolute numbers on the axis rather than percentages are much more meaningful.
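The point about pie charts can be checked with a small count. The sketch below uses invented genes and slim terms to show that annotation counts add up to more than the number of genes, so a pie slice (a share of annotations) differs from the fraction of genes involved in a process.

```python
from collections import Counter

# Invented slim annotations: each gene maps to zero or more slim terms.
slim_annotations = {
    "geneA": {"cell cycle", "transport"},
    "geneB": {"cell cycle"},
    "geneC": {"cell cycle", "transport", "signaling"},
    "geneD": set(),  # unannotated ("unknown")
}

# Count annotations per slim term (one gene can contribute to several terms).
counts = Counter(
    term for terms in slim_annotations.values() for term in terms
)
total_annotations = sum(counts.values())
n_genes = len(slim_annotations)

# A pie slice would show the share of annotations, not the share of genes.
pie_share = counts["cell cycle"] / total_annotations
gene_fraction = counts["cell cycle"] / n_genes

# Text histogram with absolute numbers, as recommended above.
for term, n in counts.most_common():
    print(f"{term:<12}{'#' * n} {n}")
print(f"annotations: {total_annotations}, genes: {n_genes}")
```

In this toy set "cell cycle" would take half of a pie chart, although three of the four genes carry that annotation.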

Fission Yeast Computing Workshop -6-
Exercise 3: “GO Slimming”

This exercise uses the generic “GO slim mapper” at Princeton to create a GO slim distribution from our gene set of interest.

Go to (this implementation is always up to date for the ontology and the annotation, and it supplies a list of “unknown” genes, which map to the root node, and a list of genes which are annotated to a non-root process but not covered by the slim).

1. Upload the protein coding gene list from Exercise 1 and select the PomBase GO Slim.
2. User-defined GO slim: in the advanced options. For example, if you wanted to create a “slim” set for component terms, you might begin with:
   GO: nucleus
   GO: mitochondrion
   GO: plasma membrane
   GO: Golgi apparatus
   GO: nucleolus
   Try this option with your list.
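To make the mapping idea concrete, here is a minimal sketch of what a slim mapper does: it follows each annotation up the ontology and reports which slim terms are reached. The toy ontology, the slim, the gene names and the function names are all invented for illustration, and only simple parent links are modelled. This is not the Princeton tool's implementation.

```python
# Toy ontology: each term lists its direct parents (invented, not real GO).
PARENTS = {
    "biological_process": [],
    "cell cycle": ["biological_process"],
    "mitotic cell cycle": ["cell cycle"],
    "transport": ["biological_process"],
    "ion transport": ["transport"],
    "signaling": ["biological_process"],
}
ROOT = "biological_process"
SLIM = {"cell cycle", "transport"}  # a user-defined slim


def ancestors(term):
    """Return the term itself plus every term reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def map_to_slim(annotations):
    """Split genes into slim hits, 'unknown' (root only) and 'not covered'."""
    slim_hits, unknown, not_covered = {}, [], []
    for gene, terms in annotations.items():
        reached = set()
        for term in terms:
            reached |= ancestors(term)
        hits = reached & SLIM
        if hits:
            slim_hits[gene] = hits
        elif reached <= {ROOT}:
            unknown.append(gene)      # annotated to the root node only
        else:
            not_covered.append(gene)  # annotated, but outside the slim
    return slim_hits, unknown, not_covered


annotations = {
    "g1": {"mitotic cell cycle"},
    "g2": {"ion transport", "mitotic cell cycle"},
    "g3": {"biological_process"},
    "g4": {"signaling"},
}
slim_hits, unknown, not_covered = map_to_slim(annotations)
print(slim_hits, unknown, not_covered)
```

Adding "signaling" to `SLIM` would move `g4` from the not-covered list into the slim hits, which is the kind of adjustment meant by changing slim terms to fit a dataset.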

Fission Yeast Computing Workshop -7-
Exercise 4: “GO Term Enrichment”

This exercise uses the generic “GO term finder” tool at Princeton to provide an enrichment analysis (significant shared terms) in a gene set of interest.

Go to

1. Upload your gene list from the Exercise
2. Select the process ontology
3. Choose the PomBase association file (annotations)
4. Repeat with the Cellular Component ontology

The results will show the most significant terms in your gene set, in order of significance. The % in your gene set compared to the % in the genome as a whole is provided, in addition to the P-value.
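As a rough illustration of where such numbers come from, the sketch below computes the two percentages and a one-sided hypergeometric P-value for a single term. Using the hypergeometric test is an assumption made for this example, since the slides do not say which statistic the tool applies, and no multiple-testing correction is included. The gene names, set sizes and function names are invented.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a set of n,
    drawn from N background genes of which K carry the term."""
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / comb(N, n)


def enrichment(gene_set, background, annotated):
    """Percentages in the set and in the background, plus the P-value."""
    n = len(gene_set)
    N = len(background)
    K = len(annotated & background)
    k = len(gene_set & annotated)
    return {
        "pct_set": 100 * k / n,
        "pct_genome": 100 * K / N,
        "p_value": hypergeom_pvalue(k, n, K, N),
    }


# Toy example: 20 background genes, 5 annotated to the term,
# and a gene set of 4 in which 3 carry the annotation.
background = {f"g{i}" for i in range(20)}
annotated = {"g0", "g1", "g2", "g3", "g4"}
gene_set = {"g0", "g1", "g2", "g10"}

result = enrichment(gene_set, background, annotated)
print(result)
```

A full analysis would repeat this for every term annotated in the gene set and sort the terms by P-value, which corresponds to the ordering by significance described above.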