Managing Data Modeling GO Workshop 3-6 August 2010.



Managing Data
- Functional modeling strategy
- Converting between database IDs: Ensembl BioMart, UniProt, DAVID, AgBase ArrayIDer
- Arrays
- Examples to work on

Types of data sets and modeling
- Commercial array data: more likely to have ID mapping to support functional modeling.
- Custom/USDA array data: you may need to do your own ID mapping (see examples on the workshop page).
- Proteomics data
- RNA-Seq data sets: computational pipelines to assign GO (GOanna is limited; contact AgBase).
- Real-time data or quantitative proteomics data: hypothesis testing.

Overview of Functional Modeling Strategy (diagram)
- Microarray IDs: ArrayIDer
- Protein/gene identifiers: GORetriever, giving GO annotations
- Genes/proteins with no GO annotations: GOanna
- GOSlimViewer: summarizes GO function
- GOModeler: hypothesis testing
- GO enrichment analysis: EasyGO/AgriGO, Onto-Express, Onto-Express-to-go (OE2GO)
- Pathways and network analysis: Ingenuity Pathways Analysis (IPA), Pathway Studio, Cytoscape, DAVID
In the diagram, yellow boxes represent AgBase tools; green/purple boxes are non-AgBase resources.

Functional Modeling Considerations
- Should I add my own GO?
  - Use GOSlimViewer to see how much GO is available for your species.
  - Use GORetriever to see how much GO is available for your dataset.
- Should I do GO analysis and pathway analysis and network analysis?
  - Different functional modeling methods show different, complementary aspects of your data.
  - Is this type of data available for your species (or a close ortholog)?
- What tools should I use?
  - Which tools have data for your species of interest?
  - What type of accessions are accepted?
  - Availability (commercial and freely available).

- Structurally and functionally re-annotated a microarray.
- Quantified the impact of this re-annotation based on the GO annotations and pathways represented on the array.
- Tested using a previously published experiment that used this microarray.
- Re-annotation allows more comprehensive GO-based modeling and improves pathway coverage.
- Re-annotation resulted in a different model from the previously published research findings.

Converting accessions
- Depending on your data set and the tools you use, you are likely to need to convert between database accessions to do your functional modeling.
- UniProt database: ID mapping tab
- Ensembl BioMart
- Online analysis tools: DAVID, g:Profiler, GORetriever
- ArrayIDer: converts EST accessions for some species (by request)

ID Mapping
Data types:
- Commercial arrays
- Custom arrays
- EST arrays
- Proteomics
- RNA-Seq data
Mapping resources:
- Commercial ID mapping, e.g. NetAffx
- Ensembl BioMart
- Online tools (g:Convert, DAVID)
- ArrayIDer
- UniProt ID conversion


Working on your own data:
- New to GO
  - GO browser tutorials to familiarize yourself with the GO
  - learn what GO is available for your species
- Your own data set
  - functional grouping to get an overview (e.g. GOSlimViewer)
  - GO enrichment analysis (tools available for your species)
- Pathway analysis
- Example data sets available: use as worked examples
Most of these tools (including Pathways Analysis) accept only certain database accessions, so you need to convert accessions between databases.

Example: ID conversion
- Ensembl Plant BioMart tool
  - currently covers a limited number of species, but Ensembl is adding more plants
  - BioMart allows sophisticated querying of genomic data
- DAVID ID conversion tool: allows users to convert IDs and do GO enrichment analysis
- UniProt ID conversion: highly annotated data
- ArrayIDer: links ESTs to public database IDs

NOTE: Ensembl is adding new plant species…

1. Ensembl BioMart

Clicking on these headings allows you to set up searches. Selecting FILTERS gives you different filtering options:

Expand GENE and check “ID list limit” to select a defined list of accessions. Enter your list of accessions.

Selecting ATTRIBUTES allows you to choose what information is reported: Check accessions from external databases (UniProt & RefSeq).

- Clicking on RESULTS will show you the output information.
- Output can be displayed online and/or downloaded (text, Excel).
- Selecting FILTERS or ATTRIBUTES will allow you to go back and make changes.
- BioMart is limited to species represented in Ensembl.
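The FILTERS / ATTRIBUTES / RESULTS steps above can also be written down as a BioMart XML query, which is one way to keep a record of exactly what was searched. The sketch below only builds the XML text; it does not contact any server. The dataset name, filter name, attribute names and accessions in the example are assumptions for illustration and should be checked against the names shown in the BioMart interface for your species.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filter_name, accessions, attributes):
    """Build a BioMart XML query: one ID-list filter plus the attributes to report."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",          # tab-separated output
        "header": "1",               # include a header row
        "uniqueRows": "1",           # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # FILTERS step: limit the search to a defined list of accessions.
    ET.SubElement(ds, "Filter", {"name": filter_name, "value": ",".join(accessions)})
    # ATTRIBUTES step: choose which columns are reported.
    for attribute in attributes:
        ET.SubElement(ds, "Attribute", {"name": attribute})
    return ET.tostring(query, encoding="unicode")


# All names below are placeholders (assumptions), not values from the workshop.
xml_text = build_biomart_query(
    dataset="ggallus_gene_ensembl",
    filter_name="ensembl_gene_id",
    accessions=["EXAMPLE_GENE_1", "EXAMPLE_GENE_2"],
    attributes=["ensembl_gene_id", "uniprotswissprot", "refseq_peptide"],
)
print(xml_text)
```

Saving this text alongside the results makes it easier to repeat the same search after Ensembl is updated.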

2. Online analysis tools: Database for Annotation, Visualization and Integrated Discovery (DAVID)
This tool works for a wide range of species.

Paste in your accession list (You can also upload a file of accessions.)
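Before pasting or uploading, it can help to tidy the accession list. The following is a minimal sketch, not part of DAVID or any other tool: the function names `clean_accessions` and `batches` are hypothetical helpers that strip whitespace, drop duplicates while keeping the original order, and optionally split a long list into smaller batches (a later slide warns that more than 1000 accessions may cause errors in UniProT ID mapping; the batch size is yours to choose).

```python
def clean_accessions(raw_text):
    """Return accessions from pasted text, de-duplicated, in original order."""
    seen = set()
    cleaned = []
    # Treat commas like whitespace so both separators work.
    for accession in raw_text.replace(",", " ").split():
        if accession not in seen:
            seen.add(accession)
            cleaned.append(accession)
    return cleaned


def batches(accessions, size):
    """Split a list of accessions into consecutive batches of at most `size`."""
    return [accessions[i:i + size] for i in range(0, len(accessions), size)]


raw = "P52705\nQ9LYA9, P52705\n\n  Q9LYA9  "
accession_list = clean_accessions(raw)
print(accession_list)
print(batches(accession_list, 1))
```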

Select accession type. NOTE: If you choose “Not Sure” the tool will try to decide what type of accession you have.

Select gene list. Submit list.

Select the type of accession you want to convert TO.

Any ambiguous IDs are listed for you to decide.
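The same check can be made on any downloaded conversion table: an ID is ambiguous when it maps to more than one target. A minimal sketch, assuming the table has already been read into (source ID, target ID) pairs; the function name and the IDs in the example are hypothetical.

```python
from collections import defaultdict


def split_by_ambiguity(pairs):
    """Separate one-to-one mappings from IDs that map to several targets."""
    targets = defaultdict(list)
    for source_id, target_id in pairs:
        # Ignore exact repeats of the same pair.
        if target_id not in targets[source_id]:
            targets[source_id].append(target_id)
    unique = {s: t[0] for s, t in targets.items() if len(t) == 1}
    ambiguous = {s: t for s, t in targets.items() if len(t) > 1}
    return unique, ambiguous


# Placeholder IDs for illustration only.
pairs = [
    ("probe_1", "GENE_A"),
    ("probe_2", "GENE_B"),
    ("probe_2", "GENE_C"),
    ("probe_1", "GENE_A"),
]
unique, ambiguous = split_by_ambiguity(pairs)
print(unique)
print(ambiguous)
```

Ambiguous IDs still need a manual decision; the sketch only lists them so that none are overlooked.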

3. UniProt ID Mapping

Paste your accession list (more than 1000 accessions may cause errors). COMMENT: Note the difference between UniProt accessions and UniProt IDs. A UniProt accession is a short string of letters and numerals (6 characters, or 10 in the extended format). A UniProt ID has a suffix related to the species name. E.g. cassava hydroxynitrilase: accession P52705, ID HNL_MANES.
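The accession/ID distinction can be checked automatically before choosing the accession type. The sketch below uses the accession pattern published in the UniProt documentation (6 or 10 characters); the pattern for IDs (entry names) is a simplified assumption: up to 10 letters or digits, an underscore, then a species code of up to 5. The function name `classify_uniprot` is hypothetical.

```python
import re

# Accession pattern as given in the UniProt documentation.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
# Simplified assumption for entry names: NAME_SPECIES.
ENTRY_NAME_RE = re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")


def classify_uniprot(identifier):
    """Return 'accession', 'entry name' or 'unknown' for one identifier."""
    if ACCESSION_RE.fullmatch(identifier):
        return "accession"
    if ENTRY_NAME_RE.fullmatch(identifier):
        return "entry name"
    return "unknown"


for identifier in ["P52705", "HNL_MANES", "not-an-id"]:
    print(identifier, classify_uniprot(identifier))
```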

Select the accession type you have and the accession type you want to convert to, then click on MAP.

The mapping link will display a tab-separated file that can be opened in Excel.
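A tab-separated mapping file can also be read with a short script, which makes it easy to see which submitted accessions did not map. This is a sketch under assumptions: the column headers "From" and "To" are assumed and may differ in your download, and `read_mapping` is a hypothetical helper name.

```python
import csv
import io


def read_mapping(handle, from_col="From", to_col="To"):
    """Read a tab-separated mapping table into {source ID: [target IDs]}."""
    mapping = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        mapping.setdefault(row[from_col], []).append(row[to_col])
    return mapping


# An in-memory stand-in for a downloaded file.
example_file = io.StringIO("From\tTo\nHNL_MANES\tP52705\n")
mapping = read_mapping(example_file)

submitted = ["HNL_MANES", "EXAMPLE_UNMAPPED"]  # second ID is a placeholder
unmapped = [acc for acc in submitted if acc not in mapping]
print(mapping)
print(unmapped)
```

For a real file, replace the `io.StringIO` object with `open("your_file.tsv", newline="")`.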

4. AgBase: ArrayIDer
Maps ESTs to gene/protein accessions. Contact AgBase to request additional species.

Upload a list of dbEST accessions or EST names.

An email will be sent with a link to the results. Results are formatted as an Excel file.

For additional help with database accessions please contact AgBase.

Working on your own data: NOTE:
- Always keep note of what tool you used to do the accession ID mapping/conversion and its version/update/date.
- Keep a copy of your original IDs and what they mapped to so that you can refer back to this during your modeling.
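One way to follow both notes is to save every conversion as a small tab-separated log that records the original ID, what it mapped to, the tool, its version and the date. The sketch below is an illustration only; the function name, column names and the example version string are assumptions.

```python
import csv
import io
from datetime import date


def write_mapping_log(handle, mapping, tool, tool_version, run_date):
    """Write one row per (original ID, mapped ID) with tool, version and date."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["original_id", "mapped_id", "tool", "tool_version", "date"])
    for original_id, mapped_ids in mapping.items():
        # Keep unmapped IDs in the log too, with an empty mapped_id.
        for mapped_id in mapped_ids or [""]:
            writer.writerow(
                [original_id, mapped_id, tool, tool_version, run_date.isoformat()]
            )


log = io.StringIO()
write_mapping_log(
    log,
    {"HNL_MANES": ["P52705"], "EXAMPLE_UNMAPPED": []},  # second ID is a placeholder
    tool="UniProt ID Mapping",
    tool_version="example-release",  # placeholder; record the real release
    run_date=date(2010, 8, 3),
)
print(log.getvalue())
```

Writing to a real file instead of `io.StringIO` only requires passing `open("mapping_log.tsv", "w", newline="")` as the handle.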

Tutorial 1: ID conversion
The AgriGO GO enrichment analysis tool accepts the following inputs for rice:
- GenBank ID: AAP
- DDBJ ID: BAB
- EMBL ID: CAA
- UniProt ID: Q9LYA9
- RefSeq Peptide ID: NP_
We will convert a list of Rice Affy IDs to these IDs for use in the AgriGO tool.

Arrays: ID Mapping
- An “annotation” file shows which database accessions the probes were based on.
- Array annotation files may include multiple database IDs.
- Commercial arrays: may be updated regularly.
- Custom/research arrays: not updated as often.
- Always check when the ID mapping was last updated, as this data changes continually.

Array annotation available:
- FHCRC chicken 13K (GPL2863)
- Agilent Chicken Gene Expression Microarray 4x44k (GPL8764)
- Avian Innate Immunity Microarray (AIIM) (GPL1461)
- Affymetrix Chicken Genome Array (GPL3213*)
- UIUC Bos taurus 13.2K 70-mer oligoarray (GPL2853)
- Affymetrix Bovine Genome Array (GPL2112)
- Agilent Bovine Oligo Microarray (4x44K)
- Equine Whole Genome Oligonucleotide (EWGO) array
Array annotation in progress:
- ARK-Genomics G. gallus 20K v1.0 (GPL5480)
- FHCRC Chicken 13K v2.0 (GPL1836)
- Chicken cDNA DDMET 1700 array version 1.0 (GPL3265)

Tutorial 1: ID conversion Work through tutorial 1 on the workshop website. Alternatively – work on your own data set during this time, using the tutorial as a guide.