Genome browsers: Visualization of genomic data
16-Nov-18
Survey: UCSC browser, Ensembl browser, others?
UCSC genome browser: Basic functionalities used in exercise
- Finding a gene: by name, by sequence
- Gene structure
- Orthologues, i.e. functional homologs in other organisms
- SNPs (Single Nucleotide Polymorphisms)
Several other functionalities
- Gene Sorter: sort according to expression, homology, in situ images of genes in different tissues
- Custom tracks: upload your own data
Genome browsers: Visualization of a gene
Flat files / tab files
>chr5:123.004.678-125.345.112
ATGAAGTTATGGGATGTCGTGGCTGTCTGCCTGGTGCTGCTCCACACCGC
GTCCGCCTTCCCGCTGCCCGCCGGTAAGAGGCCTCCCGAGGCGCCCGCCG
AAGACCGCTCCCTCGGCCGCCGCCGCGCGCCCTTCGCGCTGAGCAGTGAC
TGTAAGAACCGTTCCCTCCCCGCGGGGGGGCCGCCGGCGGACCCCCTCGC
ACCCCCACCCGCAGCCAGCCCCGCACGTACCCCAAGCCAGCCTGATGGCT
GTGTGGCCTACCGACCCGTGGGCAAGGGGTGCGGGTGCTGAAGCCCCCAG
GGGTGCCTGGCTGCCCACTGCTGCCCGCACGCCTGGCCTGAAAGTGACAC
GCGCTGGTTTGCCCAGCACAGAGGGGATGGAATTTTTATGCTGCTCCTTT
AGCATTCTGATGAACAAATATCCTCCCCACCAGCACCACCACCTCAGTAA
[Figure: gene on Chr5 drawn as Exon, Intron, Exon, with positions 123.004.678, 123.404.678, 124.987.012 and 125.345.112 marked]
Open Reading Frame (ORF): from start codon to stop codon
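To make the ORF definition concrete, here is a minimal Python sketch that scans a DNA string for the first ATG and reads codons in frame until a stop codon. The function name `find_first_orf` and the toy sequence are illustrative assumptions, not part of the slides. The sketch only looks at the forward strand and ignores introns, so it is not how a genome browser annotates genes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_first_orf(seq):
    """Return the first in-frame ATG...stop stretch on the forward strand, or None."""
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        # Step through the sequence codon by codon, starting at the ATG.
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        # No stop codon in this frame: try the next ATG.
        start = seq.find("ATG", start + 1)
    return None

# Toy sequence (hypothetical): ATG AAA TAG embedded in flanking bases.
print(find_first_orf("CCATGAAATAGGG"))  # ATGAAATAG
```

A sequence with a start codon but no in-frame stop codon, such as "ATGAAA", returns None.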
Genome browsers: Why graphic display?
Why is a graphic display better than flat files / tab files?
- A graphic display is compact
- Meta data is available, i.e. supporting information about a gene:
  - Experimental evidence such as ESTs
  - Predicted gene structures
  - SNP information
  - Links to many databases
In short, much data about a gene is gathered in one place and can be viewed easily.
Genome browsers: Visualization of a gene (Ensembl)
Genome browsers: Visualization of a gene (UCSC)
[Figure labels: Exon, Intron, UTR]
Genome browsers
UCSC genome browser: http://genome.ucsc.edu/
- Easy to use
- Updated often, but not as often as Ensembl
- Upload of personal tracks
Ensembl browser: http://www.ensembl.org/index.html
- Less easy to use
- Maintained/updated by several people
GBrowse: http://www.gmod.org/GBrowse
BLAT: BLAST-Like Alignment Tool
- BLAT (2002)
- Very fast searches (MySQL database)
- Handles introns in RNA/DNA alignments
- Checks that donor/acceptor rules are followed
- Data for more than 30 genomes (human, mouse, rat…)
[Figure: Exon, Intron, Exon with splice sites marked: donor site GT at the start of the intron, acceptor site AG at the end]
BLAT genome browser: http://genome.ucsc.edu/
Using a search term or position, e.g. chr1:10,234-11,567
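The position format can be read as chromosome, start and end, with commas as thousands separators. The Python sketch below shows one way to split such a string. The helper `parse_position` and its regular expression are assumptions made for illustration; they are not the parser the UCSC browser itself uses.

```python
import re

# chromosome name, then start-end; digits may contain thousands separators
POSITION = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$", re.IGNORECASE)

def parse_position(text):
    """Split a position like 'Chr1:10,234-11,567' into (chrom, start, end)."""
    match = POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a chromosome position: {text!r}")
    chrom = "chr" + match.group(1)[3:]          # normalise the 'Chr' prefix
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not be larger than end")
    return chrom, start, end

print(parse_position("Chr1:10,234-11,567"))  # ('chr1', 10234, 11567)
```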
BLAT genome browser: http://genome.ucsc.edu/
Using a protein or DNA sequence
BLAT genome browser
BLAT genome browser "Details": Correct splice site?
Logo plot: Information content
$IC = -H(p) + \log_2(4) = \sum_a p_a \log_2 p_a + 2$
Here $p_a$ is the frequency of letter $a$ at a position. The constant is $\log_2(4) = 2$ for DNA (4 nucleotides) and $\log_2(20) = 4.32$ for proteins (20 amino acids).
The information content is calculated from a multiple sequence alignment. The result is a graphical visualization of sequence conservation where:
- Total height at a position is the information content
- Height of a single letter is proportional to the frequency of that letter
Multiple alignment of 3 protein sequences:
Seq1: A L R K P Q R T
Seq2: A V R H I L L I
Seq3: A I K V H N N T
Pos1: $I = [1 \cdot \log_2(1)] + \log_2(20) = 0 + 4.32 = 4.32$
Pos2: $I = [\tfrac{1}{3}\log_2(\tfrac{1}{3}) + \tfrac{1}{3}\log_2(\tfrac{1}{3}) + \tfrac{1}{3}\log_2(\tfrac{1}{3})] + 4.32 = 2.74$
Pos3: $I = [\tfrac{2}{3}\log_2(\tfrac{2}{3}) + \tfrac{1}{3}\log_2(\tfrac{1}{3})] + 4.32 = 3.40$
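The per-position calculation can be checked with a short Python sketch that applies the same formula to each column of the three-sequence alignment. The function name `information_content` and the `alphabet_size` parameter are illustrative choices. No small-sample correction is applied, which real logo tools may add.

```python
import math
from collections import Counter

def information_content(column, alphabet_size=20):
    """sum_a p_a * log2(p_a) + log2(alphabet_size) for one alignment column."""
    counts = Counter(column)
    total = len(column)
    entropy_term = sum((n / total) * math.log2(n / total) for n in counts.values())
    return entropy_term + math.log2(alphabet_size)

alignment = ["ALRKPQRT", "AVRHILLI", "AIKVHNNT"]
# Transpose the alignment so that each string is one column (position).
columns = ["".join(col) for col in zip(*alignment)]
ic_values = [information_content(col) for col in columns]

for pos, (col, ic) in enumerate(zip(columns, ic_values), start=1):
    print(f"Pos{pos} {col}: {ic:.2f}")
```

Passing `alphabet_size=4` gives the DNA version of the formula, with a maximum of 2 bits per position.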
Logo plot: Exon
BLAT genome browser "Details": Correct splice site?
BLAT genome browser "Details"
exon...G | GT...intron...AG | exon...
(donor site at the first boundary, acceptor site at the second)
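The GT...AG pattern at the intron boundaries can be checked mechanically. The Python sketch below tests whether an intron sequence starts with GT (donor site) and ends with AG (acceptor site). The function name and example sequences are hypothetical. Rarer non-canonical introns (for example GC...AG) also exist, so a failed check is a reason to inspect the alignment rather than proof of an error.

```python
def follows_gt_ag_rule(intron):
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    intron = intron.upper()
    # Require at least four bases so the two dinucleotides cannot overlap.
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")

print(follows_gt_ag_rule("GTAAGAGGCCTCAG"))  # True
print(follows_gt_ag_rule("GCAAGAGGCCTCAG"))  # False
```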
BLAT genome browser
BLAT genome browser "Browser"
- Base, Center & Zoom
- Known genes
- Predictions
- RNA
- EST
- Expression
- Conservation
Genome browsers
BLAT genome browser: Center & zoom
- Selected number of tracks
- Forward/reverse direction
BLAT genome browser: Sequence orthologs
[Screenshot annotation: "click"]
SNPs
Single Nucleotide Polymorphism (SNP)
- SNPs can be located anywhere in the genome
- Non-synonymous SNP (nsSNP): the amino acid is changed
- Synonymous SNP: the amino acid is unchanged, so the protein sequence is not affected
[Figure labels: V I T P]
- An amino acid is coded by 3 nucleotides, e.g. Valine (V): GTC
- Humans are diploid: cells have 2 homologous copies of each chromosome, i.e. 2*23 chromosomes
- Haploid cells (sex cells) have only 23 chromosomes
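The difference between a synonymous and a non-synonymous SNP can be illustrated by translating a codon before and after a single-base change. The Python sketch below builds the standard genetic code table and classifies a substitution in the Valine codon GTC. The function `classify_snp` and the chosen substitutions are illustrative assumptions. A change to a stop codon is also reported as non-synonymous here, without distinguishing it further.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_snp(codon, position, new_base):
    """Classify a single-base change at 0-based `position` within `codon`."""
    codon = codon.upper()
    mutated = codon[:position] + new_base.upper() + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "non-synonymous"
    return kind, before, after

print(classify_snp("GTC", 0, "A"))  # ('non-synonymous', 'V', 'I')
print(classify_snp("GTC", 2, "T"))  # ('synonymous', 'V', 'V')
```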
Diploid organism (most mammals)
An example of two homologous copies of a chromosome, e.g. chromosome 9, within a cell:
- One chromosome from the mother
- One chromosome from the father
If the red strand is the plus strand, the genotype is C;T (or T;C, but we write it alphabetically).
If the green strand is the minus strand, the genotype is G;A, but we write it alphabetically as A;G.
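Converting a genotype from one strand to the other means complementing each allele and then writing the result in alphabetical order. A minimal Python sketch follows. The function name and the semicolon-separated input format follow the notation on this slide and are otherwise assumptions.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def genotype_on_other_strand(genotype):
    """Complement each allele of a genotype like 'C;T' and sort alphabetically."""
    alleles = genotype.upper().split(";")
    complemented = [COMPLEMENT[allele] for allele in alleles]
    return ";".join(sorted(complemented))

print(genotype_on_other_strand("C;T"))  # A;G
```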
SNPs
Exercise
- Basic understanding of the graphics
- Effect of Single Nucleotide Polymorphisms (SNPs)
- Finding orthologue genes
- Identify chromosomal locus for a gene