Visualization of genomic data Genome browsers
Survey: How many have used a genome browser? UCSC browser? Ensembl browser? Others?
Genome browsers: Visualization of a gene (flat files / tab files)
>sequence
ATGAAGTTATGGGATGTCGTGGCTGTCTGCCTGGTGCTGCTCCACACCGC
GTCCGCCTTCCCGCTGCCCGCCGGTAAGAGGCCTCCCGAGGCGCCCGCCG
AAGACCGCTCCCTCGGCCGCCGCCGCGCGCCCTTCGCGCTGAGCAGTGAC
TGTAAGAACCGTTCCCTCCCCGCGGGGGGGCCGCCGGCGGACCCCCTCGC
ACCCCCACCCGCAGCCAGCCCCGCACGTACCCCAAGCCAGCCTGATGGCT
GTGTGGCCTACCGACCCGTGGGCAAGGGGTGCGGGTGCTGAAGCCCCCAG
GGGTGCCTGGCTGCCCACTGCTGCCCGCACGCCTGGCCTGAAAGTGACAC
GCGCTGGTTTGCCCAGCACAGAGGGGATGGAATTTTTATGCTGCTCCTTT
AGCATTCTGATGAACAAATATCCTCCCCACCAGCACCACCACCTCAGAAA
Genome browsers: Why a graphic display? Why is a graphic display better than flat files / tab files? A graphic display is compact. Metadata is available, i.e. supporting information about a gene: experimental evidence such as ESTs; predicted gene structures; SNP information; links to many databases. In short, much data about a gene is gathered in one place and can be viewed easily.
Genome browsers Visualization of a gene (Ensembl)
Genome browsers Visualization of a gene (UCSC) Exon Intron UTR
Genome browsers
Genome browsers. UCSC genome browser: easy to use; updated often, but not as often as Ensembl; upload of personal tracks. Ensembl browser: less easy to use; maintained/updated by several people. Gbrowser.
UCSC genome browser: Basic functionalities. Finding a gene (by name, by sequence). Gene structure. Sequence orthologues. Single Nucleotide Polymorphisms. Gene Sorter (sort according to expression, homology...). Custom tracks.
BLAT genome Browser
BLAT genome Browser: Using a search term or position, e.g. Chr1:10,234-11,567
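A position string of this form can also be handled in a script. The following is a minimal sketch, assuming the "chromosome:start-end" layout shown on the slide with optional thousands separators; `parse_position` is a hypothetical helper name, not part of the UCSC browser.

```python
import re

# Hypothetical helper for illustration; the pattern assumes the
# "ChrN:start-end" layout from the slide, commas allowed in the numbers.
_POSITION = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$", re.IGNORECASE)


def parse_position(text):
    """Split a string like 'Chr1:10,234-11,567' into (chromosome, start, end)."""
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a position string: {text!r}")
    chrom, start, end = match.groups()
    # Remove thousands separators before converting to integers.
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


print(parse_position("Chr1:10,234-11,567"))
```

The helper only validates the syntax; whether the coordinates exist on a given assembly is something the browser itself has to answer.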
BLAT genome Browser
BLAT genome Browser Using a protein or DNA sequence
BLAT: BLAST-Like Alignment Tool (2002). Very fast searches (MySQL database). Handles introns in RNA/DNA alignments. Data for more than 30 genomes (human, mouse, rat…). (Figure: exon, intron, exon; splice sites)
BLAT genome Browser
BLAT genome Browser “Details”: Correct splice site?
Logo Plot: Information Content
For DNA (4 letters): $IC = -H(p) + \log_2(4) = \sum_a p_a \log_2 p_a + 2$
The information content is calculated from a multiple sequence alignment. The result is a graphical visualization of sequence conservation where:
Total height at a position is the information content
Height of a single letter is proportional to the frequency of that letter
Multiple alignment of 3 protein sequences (20 letters, so the maximum is $\log_2(20)$):
Seq1: A L R K P Q R T
Seq2: A V R H I L L I
Seq3: A I K V H N N T
Pos1: $I = 1 \cdot \log_2(1) + \log_2(20) = 4.32$
Pos2: $I = 3 \cdot \tfrac{1}{3}\log_2(\tfrac{1}{3}) + \log_2(20) = 2.74$
Pos3: $I = \tfrac{2}{3}\log_2(\tfrac{2}{3}) + \tfrac{1}{3}\log_2(\tfrac{1}{3}) + \log_2(20) = 3.40$
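The per-position calculation can be reproduced in a few lines. This is a sketch under stated assumptions: frequencies are taken directly from the three aligned sequences on the slide, no small-sample correction is applied, and `information_content` is a name chosen here for illustration.

```python
import math
from collections import Counter


def information_content(column, alphabet_size):
    """IC = log2(alphabet_size) + sum_a p_a * log2(p_a) for one alignment column."""
    counts = Counter(column)
    total = len(column)
    # Sum of p * log2(p) over observed letters; this is -H(p), so it is <= 0.
    neg_entropy = sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) + neg_entropy


# The three protein sequences from the slide, written without spaces.
alignment = ["ALRKPQRT", "AVRHILLI", "AIKVHNNT"]
columns = list(zip(*alignment))  # one tuple of residues per position
ic = [information_content(col, alphabet_size=20) for col in columns]

for position, value in enumerate(ic[:3], start=1):
    print(f"Pos{position}: {value:.2f}")
```

Passing `alphabet_size=4` gives the DNA version of the same formula. In a logo, each letter's height would then be its frequency multiplied by the column's information content.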
Logo Plot Exon
BLAT genome Browser “Details”: Correct splice site?
BLAT genome Browser “Details”: Donor site: exon...G | GT...intron. Acceptor site: intron...AG | exon...
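The GT...AG pattern can be checked programmatically once the intron boundaries from an alignment are known. The following is a minimal sketch, assuming the intron is given on the coding strand; the function name and the example sequences are made up for illustration and do not come from BLAT output.

```python
def has_canonical_splice_sites(intron):
    """Return True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron.upper()
    # Require at least 4 bases so the donor and acceptor dinucleotides do not overlap.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Made-up example: exon ... G | GT ... intron ... AG | exon ...
exon1 = "ATGAAGTTATGGGATG"
intron = "gtaagaggcctcccgaggcgcccttcag"
exon2 = "TCGTGGCTGTCTGCCT"

print(has_canonical_splice_sites(intron))      # canonical GT...AG intron
print(has_canonical_splice_sites("ctaagtac"))  # boundaries shifted, not GT...AG
```

A failed check on an aligned intron suggests that the exon boundary reported by the alignment may be off by a few bases, which is the question the "Details" page helps answer.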
BLAT genome Browser
BLAT genome Browser “Browser”: Base, Center & Zoom; Known genes; Predictions; RNA; EST; Conservation; Expression
BLAT genome Browser Center & zoom
Forward/reverse direction Selected number of tracks
BLAT genome Browser Sequence Orthologs
“click”
BLAT genome Browser Sequence Orthologs
SNPs
Chromosomal locus: A locus is a physical location on a chromosome. p is the ‘short arm’, q is the ‘long arm’. A locus range may describe the location of a gene, e.g. 22q11.21-q12.2. Reading 22q12.2: Chromosome 22, Arm q, Band 12, Sub-band 2.
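The chromosome / arm / band / sub-band breakdown can be expressed as a small parser. This is a sketch, assuming labels of the simple form used on the slide (such as 22q12.2); `parse_band` is a hypothetical helper and does not cover every form of cytogenetic nomenclature.

```python
import re

# Chromosome (1-22, X or Y), arm (p or q), band digits, optional ".sub-band".
_BAND = re.compile(r"^(\d{1,2}|[XY])([pq])(\d+)(?:\.(\d+))?$")


def parse_band(label):
    """Split a label such as '22q12.2' into chromosome, arm, band and sub-band."""
    match = _BAND.match(label.strip())
    if match is None:
        raise ValueError(f"not a band label: {label!r}")
    chromosome, arm, band, sub_band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": arm,            # 'p' = short arm, 'q' = long arm
        "band": band,
        "sub_band": sub_band,  # None when the label has no '.N' part
    }


print(parse_band("22q12.2"))
```

A locus range is then a pair of such labels on the same chromosome, with the chromosome number usually written only once.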
Chromosomal locus Searching with gene name
Chromosomal locus Searching with locus range
Custom tracks: Upload your personal data. Share data with colleagues. Data need to be related to a reference organism.
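One common way to prepare personal data for upload is a BED-style text file headed by a `track` line. The following is a minimal sketch, assuming the simple four-column BED layout (chromosome, start, end, name); the helper name, the track name and the coordinates are hypothetical examples, not real annotations.

```python
def make_custom_track(name, description, features):
    """Build custom-track text: a 'track' header line followed by BED lines.

    features: iterable of (chrom, start, end, label) using the BED
    convention of 0-based start and exclusive end.
    """
    lines = [f'track name="{name}" description="{description}"']
    for chrom, start, end, label in features:
        if not 0 <= start < end:
            raise ValueError(f"bad interval for {label}: {start}-{end}")
        # BED fields are tab-separated: chromosome, start, end, name.
        lines.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join(lines) + "\n"


# Hypothetical regions on chromosome 22, for illustration only.
track_text = make_custom_track(
    "myRegions",
    "Example regions of interest",
    [("chr22", 1000, 5000, "regionA"), ("chr22", 7000, 7500, "regionB")],
)
print(track_text)
```

The chromosome names and coordinates only make sense relative to one genome assembly, which is why the data have to be tied to a reference organism before they can be displayed.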
Custom tracks
Exercise: 1. Basic understanding of the graphics 2. Effect of Single Nucleotide Polymorphisms (SNPs) 3. Finding orthologous genes 4. Identifying the chromosomal locus of a gene