Published by Jessie Morris. Modified over 9 years ago.
1
Andy Conley, 3/26/2012
2
James Kent. Know that name. He is one of the greatest, perhaps the greatest, bioinformatics programmers ever. He was deeply involved in the assembly of the public human genome project. If you were in the fall class, you compiled the James Kent source tree. Almost all of it is his. He speaks nothing but the truth.
3
“Genome browsers facilitate genomic analysis by presenting alignment, experimental and annotation data in the context of genomic DNA sequences.” (Melissa S. Cline & James W. Kent, 2009)
Genome browsers aggregate data.
4
Clicking on any of these takes you to a page full of details. Example: CDKN2A
5
Tracks can be any kind of genomic information:
Genes
Transposable element insertions
Transcription factor binding sites
Sites prone to recombination
Conservation of genomic sequences
Extremely important in modern times are tracks displaying ChIP-seq or RNA-seq data.
6
Arguably the most advanced genome browser, the UCSC browser is much more than a tool for looking at genomes. It integrates a huge amount of data for each gene it displays. UCSC also has a graphical front end for downloading from its huge backend database.
7
It hosts the ENCODE project, one of the largest, probably the largest, assemblies of functional genomic data. It lets you jump between orthologous regions in different genomes, for example CDKN2A. It’s a massive, massive database backend of over 6500 tables.
8
It’s really, really, really hard to install. It’s impossible to understand how hard unless you’ve tried to do it. The UCSC genome browser works so well for the genomes that it has because it is so very, very specialized for those genomes. Each track in the UCSC browser has been lovingly crafted.
9
A ridiculous number of genomes. They’re going to be coming out even faster in the next year or two, then faster after that. Things like the new PacBio platform, which provides longer reads, should make assembling eukaryotic genomes easier.
10
You can’t load or annotate them by hand; it all has to be automated. The UCSC guys do it for the human genome because it’s the human genome. The genomes are all different from each other. You have to have some easily deployable storage/display method for your data.
11
There are a number of choices out there for a genome browser, but there are really just two big ones:
UCSC
GMOD & GBrowse
We already discussed why you don’t use the UCSC browser for projects.
12
Generic: it can handle any organism.
Model Organism: not really; it takes whatever genome.
Database: not really a database, but there is a database in it.
GMOD just sounds good.
gmod.org
13
A simple, easily deployable method for storing, viewing and editing genomic data. GMOD has many, many parts. Some of the big ones:
Apollo: eww
Chado: a mechanism for storing genomic data
GBrowse: a genome browser
14
GBrowse is probably (definitely) the most commonly used of the GMOD components. It is a simple but extensible platform for displaying genomic data. It is maintained mostly by this man: Scott Cain.
15
Many projects use GBrowse as their genome viewer.
16
WormBase is to the C. elegans genome what the UCSC browser is to the human and mouse genomes. It is huge.
17
FlyBase hosts many Drosophila genomes, though not with the depth of WormBase. WormBase is really at the top of non-UCSC browsers in its depth of information. This makes sense, given that nematodes are so heavily studied and very easy to work with.
18
The result of the first couple of years of the class. Currently maintained by Lee Katz at the CDC.
19
20
This shows genes that we thought were horizontally transferred. Darker genes had more programs indicating that they were horizontally transferred.
21
We had a track of virulence factors in the first year. Clicking on any of them took you to details for the gene, a link to VFDB, etc.
22
You can alter how tracks are shown in other ways: add and remove tracks, or change the link that appears over a feature in the genome.
23
One big, important thing: “Genome browsers facilitate genomic analysis by presenting alignment, experimental and annotation data in the context of genomic DNA sequences.” (Melissa S. Cline & James W. Kent, 2009)
Genome browsers, in short, aggregate data.
24
My rotifer transcriptome browser. It doesn’t have to be a genome. Not super exciting from this view: just the predicted coding region of an assembled contig (mRNA).
25
26
Synteny: the relative ordering of things in a genome. Just a few years ago, this was not available in GBrowse; it is now. This could easily work for comparing different bacterial species.
27
28
29
Are genome browsers useful?
30
We deal with huge volumes of data. The fall class will recall my hatred of GUIs. We want high throughput. Genome browsers give you none of this. None.
31
I spent quite a bit of time in undergrad doing bench work for Dr. Nils Kroger across the street. I worked with these little guys: fascinating creatures. I cared about three genes: Sil1, Sil2, Sil3. The day the genome browser came out changed the game.
32
Still pretty useful. My main uses:
1. Make sure my data are correct. Are my intersections between genes and transposable element insertions correct?
2. Download hosted data.
3. Make nice pictures.
4. Like a biologist, get information about specific genes.
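The first use, checking gene/transposable element intersections against what the browser shows, can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the talk: the `Feature` tuple, the feature names, and the coordinates are all hypothetical, and coordinates are assumed to be 1-based and inclusive, as in GFF.

```python
from collections import namedtuple

# Hypothetical record type: one genomic feature on one contig.
Feature = namedtuple("Feature", ["contig", "start", "end", "name"])

def overlaps(a, b):
    """True if two features share a contig and at least one base.

    Coordinates are assumed to be 1-based and inclusive.
    """
    return a.contig == b.contig and a.start <= b.end and b.start <= a.end

def intersect(genes, insertions):
    """Return (gene name, insertion name) for every overlapping pair.

    Brute force (every gene against every insertion); fine for a spot
    check, too slow for whole-genome work.
    """
    return [(g.name, t.name) for g in genes for t in insertions if overlaps(g, t)]

# Made-up example data.
genes = [
    Feature("contig1", 100, 400, "geneA"),
    Feature("contig1", 900, 1200, "geneB"),
]
insertions = [
    Feature("contig1", 350, 600, "TE1"),   # overlaps geneA
    Feature("contig2", 950, 1000, "TE2"),  # same range as geneB, different contig
]

hits = intersect(genes, insertions)
print(hits)
```

Pulling up a handful of the reported pairs in the browser is a quick way to confirm that the script and the picture agree.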
33
How useful is it really? It really depends on who you ask. It’s really for biologists: they find the browser, search for their favorite gene and get some details about it. Once again, data aggregation.
34
They were super excited about it. They use it all the time. It is like magic to them. If you were to show an iPhone to somebody from 1975, it would be pretty much the same thing. Almost.
35
Will it ever be the greatest genome browser? No. That will always be the UCSC browser.
Will it remain the easiest to install for some time? Probably.
Will you get the best return on time spent? Yep.
Synteny is poorly conserved in Haemophilus, so avoid GBrowse_syn for this class, but do keep it in mind.
36
Genome browsers:
Allow navigation of the genome
Show genomic features, whatever they are
Show annotations
Show comparisons
37
GBrowse, and all of GMOD, use GFF (Generic Feature Format) files. Most of it is pretty simple: chromosome (contig), start, stop, strand, ID. The last column is what’s important. It lets you put whatever information about the feature you want in there. It’s a very flexible format.
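To make the column layout concrete, here is a minimal Python sketch that splits one GFF line into its fields and unpacks the free-form last column. It assumes the GFF3 flavor (nine tab-separated columns, with the last column holding `key=value` pairs separated by semicolons and URL-style escaping); the slide does not say which GFF version the class used. The example line, including the contig, source, and gene names, is made up for illustration.

```python
from urllib.parse import unquote

# The nine GFF3 columns, in order.
GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)

def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict.

    The attributes column becomes its own dict of key -> value.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])

    attributes = {}
    for pair in record["attributes"].split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        attributes[unquote(key)] = unquote(value)
    record["attributes"] = attributes
    return record

# Hypothetical feature line; every value here is invented.
example = (
    "contig1\texample\tgene\t100\t400\t.\t+\t.\t"
    "ID=gene0001;Name=hypothetical;Note=possible%20virulence%20factor"
)
feature = parse_gff3_line(example)
print(feature["seqid"], feature["start"], feature["end"], feature["strand"])
print(feature["attributes"])
```

Because the last column is just a bag of key/value pairs, adding a new kind of annotation means adding a new key rather than changing the file format.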
38
Thanks for listening.