Working with gene lists: Finding data using GEO & BioMart June 5, 2014.


1 Working with gene lists: Finding data using GEO & BioMart June 5, 2014

2 Analyzing a gene list
• With hundreds of genes but a limited budget and limited lab personnel, you need to narrow the gene list down to candidate genes for follow-up
• Pick ones that are “interesting”:
  • Known to be involved in related processes but not (yet) in your process of interest
  • Has protein features that suggest a function in your process but has not been characterized
  • No known function or domain, but shows up in other, related high-throughput experiments, suggesting a key role in your process of interest

3 Our approach
Analyzing gene lists by:
1. Finding overlap with other high-throughput experiments
2. Finding additional information using BioMart
   1. Mouse/human homologs
   2. Protein domain content
   3. GO classification
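The first step above, finding overlap with other high-throughput experiments, comes down to set operations on gene identifiers. The following is a minimal sketch, assuming both lists use the same identifier type; the function names and the example IDs other than C04F5.7 are hypothetical placeholders, and uppercasing is just one possible normalization.

```python
def load_gene_ids(lines):
    """Normalize one-ID-per-line input: strip whitespace, drop blanks, uppercase."""
    return {line.strip().upper() for line in lines if line.strip()}


def overlap_report(my_genes, other_genes):
    """Compare two sets of gene IDs and summarize what they share."""
    shared = my_genes & other_genes
    only_mine = my_genes - other_genes
    fraction = len(shared) / len(my_genes) if my_genes else 0.0
    return {
        "shared": sorted(shared),
        "only_mine": sorted(only_mine),
        "fraction_shared": fraction,
    }


# Hypothetical inputs: in practice these would be read from your own gene
# list and from a published list of differentially expressed genes.
my_list = load_gene_ids(["C04F5.7", "geneA", "geneB ", ""])
published = load_gene_ids(["genea", "C04F5.7", "geneC"])

report = overlap_report(my_list, published)
print(report["shared"])
print(report["fraction_shared"])
```

Genes in "shared" are candidates supported by both experiments. If the two lists use different ID types (for example WormBase IDs versus Entrez Gene IDs), convert them to a common type first.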

4 GEO (Gene Expression Omnibus)
• GEO Datasets
  • Curated gene expression datasets (i.e., there is a backlog of experiments that haven’t made it into the database yet)
  • Can search for experiments and conduct differential gene expression queries on some datasets
  • Can download datasets & do offline analyses
• GEO Profiles
  • Profiles of expression data for genes

5 Why search GEO?
• What other experiments have been done that are similar to yours?
  • GEO Datasets
• How do my genes of interest behave in other large-scale experiments?
  • GEO Profiles

6 GEO Profile search Search on a gene name (C04F5.7):

7

8

9 GEO Dataset search “C. elegans”: 4434

10 GEO Dataset searches
Query                          Total datasets   C. elegans datasets
C. elegans                     4434             4072
C. elegans AND response        131              121
C. elegans AND host response   5                5
C. elegans AND immune          24               20
C. elegans AND antimicrobial   109              94
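Searches like these can also be run programmatically. The sketch below assumes the NCBI E-utilities esearch endpoint with db="gds" for GEO DataSets ("geoprofiles" would be the analogous choice for GEO Profiles). The helper names are hypothetical, the sample XML is hand-written for illustration rather than a real response, and live counts will differ from the 2014 numbers on the slide.

```python
import urllib.parse
import xml.etree.ElementTree as ET

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="gds"):
    """Build an esearch URL; retmax=0 asks for the hit count only, no IDs."""
    params = {"db": db, "term": term, "retmax": "0"}
    return EUTILS_ESEARCH + "?" + urllib.parse.urlencode(params)


def parse_count(esearch_xml):
    """Extract the total hit count from an esearch XML response."""
    root = ET.fromstring(esearch_xml)
    return int(root.findtext("Count"))


url = build_esearch_url("C. elegans AND immune")
print(url)

# Hand-written example of the response shape (not real data):
sample_xml = (
    "<eSearchResult><Count>24</Count><RetMax>0</RetMax>"
    "<RetStart>0</RetStart><IdList/></eSearchResult>"
)
count = parse_count(sample_xml)
print(count)

# To query NCBI for real (requires network access):
# import urllib.request
# with urllib.request.urlopen(url) as resp:
#     print(parse_count(resp.read().decode("utf-8")))
```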

11 Once a dataset is identified
• Download the data
  • SOFT format: tab-delimited data
• Issues:
  • Data are not necessarily processed into experiment/control ratios
  • If starting from raw data, you may not be able to replicate exactly what the authors did, or may lack the expertise/software to generate a list of differentially expressed (DE) genes
• Look for supplementary data from the publication
  • Usually a list of all DE genes is provided
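As an illustration of the first issue above, here is a minimal sketch of reading the tab-delimited table out of a SOFT file and computing log2(experiment/control) per gene. It assumes metadata lines start with ^, ! or #, that the values are not already log-transformed, and that the table has an IDENTIFIER column; the sample column names and the sample data are hypothetical.

```python
import csv
import io
import math


def read_soft_table(text):
    """Return (header, rows) from SOFT text, skipping ^ ! # metadata lines."""
    data_lines = [ln for ln in text.splitlines() if ln and ln[0] not in "^!#"]
    reader = csv.reader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    header = next(reader)
    rows = [dict(zip(header, row)) for row in reader]
    return header, rows


def log2_ratios(rows, experiment_col, control_col, id_col="IDENTIFIER"):
    """Compute log2(experiment/control); skip missing or non-positive values."""
    ratios = {}
    for row in rows:
        try:
            gene = row[id_col]
            exp = float(row[experiment_col])
            ctl = float(row[control_col])
        except (KeyError, ValueError):
            continue  # e.g. "null" entries or absent columns
        if exp > 0 and ctl > 0:
            ratios[gene] = math.log2(exp / ctl)
    return ratios


# Hypothetical miniature SOFT file; real files have many samples and probes.
sample = "\n".join([
    "^DATASET = GDS_EXAMPLE",
    "!dataset_table_begin",
    "ID_REF\tIDENTIFIER\tGSM_CONTROL\tGSM_TREATED",
    "probe_1\tgeneA\t100\t400",
    "probe_2\tgeneB\t50\tnull",
    "!dataset_table_end",
])

header, rows = read_soft_table(sample)
ratios = log2_ratios(rows, "GSM_TREATED", "GSM_CONTROL")
print(ratios)
```

A single ratio like this ignores replicates and statistics, which is why the list of DE genes published by the authors is usually the better starting point.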

12 Choice of dataset for comparison In class demo

13 BioMart – EBI Ensembl
• Uses a series of menus:
  • Data source: organism (genes, variation, etc.)
  • Filters: reduce the number of results
  • Attributes: what data to return
• Can set up very precise and multilayered queries
• Can query across multiple organisms
• Simple query:
  • Given a list of gene IDs, you can obtain attributes or sequences for the entire list
• Tools
  • ID converter: very useful, easy to use

14 Two sites for BioMart access www.biomart.org

15 Database journal issue on BioMart

16

17 Filtering in BioMart

18 Attributes in BioMart

19 BioMart
• Filters
  • C. elegans genes with a human homolog
  • Specify only genes with >= # isoforms
  • Protein-coding genes with a transmembrane domain
• Attributes
  • Entrez Gene IDs, WormBase IDs, Affy IDs
  • Sequence data: transcript, protein, UTRs, flanking regions, etc.
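The same dataset/filters/attributes structure appears in BioMart's XML query format, which can be sent to a martservice URL instead of clicking through the menus. The sketch below only builds the query. The dataset name, filter name, and attribute names are assumptions that should be checked against the names shown in the BioMart interface, the gene IDs are placeholders, and build_biomart_query is a hypothetical helper.

```python
import urllib.parse
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Build a BioMart XML query: one dataset, its filters, then attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


# Assumed names; verify them in the BioMart web interface before use.
gene_ids = ["GENE_ID_PLACEHOLDER_1", "GENE_ID_PLACEHOLDER_2"]
attributes = ["ensembl_gene_id", "external_gene_name", "go_id"]
query_xml = build_biomart_query(
    dataset="celegans_gene_ensembl",
    filters={"ensembl_gene_id": ",".join(gene_ids)},
    attributes=attributes,
)

# Assumed endpoint (Ensembl's BioMart service); the query goes in the URL.
url = "https://www.ensembl.org/biomart/martservice?query=" + urllib.parse.quote(query_xml)
print(query_xml)
```

Fetching the URL should return tab-separated text with one column per attribute. Building the query in the web interface first and exporting its XML is a safer way to get the exact filter and attribute names.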

20 BioMart
• In-class demo

21 Today’s exercise
• Compare the current dataset from the PLoS Pathogens paper to data from a different dataset
• Identify & retrieve additional information about C. elegans genes using BioMart

