Visualization of genomic data Genome browsers. UCSC browser Ensembl browser Others ? Survey.

Presentation transcript:

Visualization of genomic data Genome browsers

UCSC browser Ensembl browser Others ? Survey

UCSC genome browser
Basic functionalities used in the exercise:
- Finding a gene, by name or by sequence
- Gene structure
- Orthologues, i.e. functional homologs in other organisms
- SNPs (Single Nucleotide Polymorphisms)
Several other functionalities:
- Gene Sorter: sort genes according to expression, homology, or in situ images of genes in different tissues
- Custom tracks: upload your own data

Genome browsers: Visualization of a gene
Flat files / tab files:
>chr5:
ATGAAGTTATGGGATGTCGTGGCTGTCTGCCTGGTGCTGCTCCACACCGC
GTCCGCCTTCCCGCTGCCCGCCGGTAAGAGGCCTCCCGAGGCGCCCGCCG
AAGACCGCTCCCTCGGCCGCCGCCGCGCGCCCTTCGCGCTGAGCAGTGAC
TGTAAGAACCGTTCCCTCCCCGCGGGGGGGCCGCCGGCGGACCCCCTCGC
ACCCCCACCCGCAGCCAGCCCCGCACGTACCCCAAGCCAGCCTGATGGCT
GTGTGGCCTACCGACCCGTGGGCAAGGGGTGCGGGTGCTGAAGCCCCCAG
GGGTGCCTGGCTGCCCACTGCTGCCCGCACGCCTGGCCTGAAAGTGACAC
GCGCTGGTTTGCCCAGCACAGAGGGGATGGAATTTTTATGCTGCTCCTTT
AGCATTCTGATGAACAAATATCCTCCCCACCAGCACCACCACCTCAGTAA
Open Reading Frame (ORF): from start codon to stop codon
[Figure labels: Chr, Exon, Intron]
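The definition of an ORF (from start codon to stop codon) can be illustrated with a minimal sketch. The function name `find_first_orf` is hypothetical, and the sketch assumes a forward-strand scan with the three standard stop codons. It ignores introns, so on genomic DNA the result will generally not match the real gene structure.

```python
# Standard stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(seq):
    """Return (start, end) of the first ATG...stop ORF on the forward strand.

    Coordinates are 0-based and end-exclusive, so seq[start:end] is the ORF
    including its stop codon. Returns None if no complete ORF is found.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon in the reading frame set by this ATG.
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop codon: try the next ATG.
        start = seq.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    example = "CCATGAAATAGGG"
    span = find_first_orf(example)
    print(span, example[span[0]:span[1]])
```

A genome browser adds exactly what this scan lacks: the exon/intron annotation that tells which parts of the genomic sequence end up in the transcript.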

Genome browsers: Why a graphic display?
Why is a graphic display better than flat files / tab files?
- A graphic display is compact
- Metadata is available, i.e. supporting information about a gene:
  - Experimental evidence such as ESTs
  - Predicted gene structures
  - SNP information
- Links to many databases
In short, much of the data about a gene is gathered in one place and can be viewed easily.

Genome browsers Visualization of a gene (Ensembl)

Genome browsers Visualization of a gene (UCSC) Exon Intron UTR

Genome browsers
UCSC genome browser
- Easy to use
- Updated often, but not as often as Ensembl
- Upload of personal tracks
Ensembl browser
- Less easy to use
- Maintained/updated by several people
Gbrowser

BLAT: Blast Like Alignment Tool
BLAT (2002)
- Very fast searches (MySQL database)
- Handles introns in RNA/DNA alignments
- Checks that donor/acceptor rules are followed
- Data for more than 30 genomes (human, mouse, rat…)
Splice sites: Exon | donor site GT ... intron ... AG acceptor site | Exon
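The donor/acceptor rule (an intron starts with GT and ends with AG) can be checked with a few lines of code. This is only an illustrative sketch and not how BLAT itself is implemented. The function names and the 0-based, end-exclusive exon coordinates are assumptions made for the example.

```python
def introns_from_exons(seq, exons):
    """Return the intron sequences lying between consecutive exons.

    `exons` is a list of (start, end) pairs, 0-based and end-exclusive,
    sorted by position on the forward strand (an assumed convention).
    """
    introns = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        introns.append(seq[prev_end:next_start])
    return introns


def has_canonical_splice_sites(intron):
    """True if the intron begins with GT (donor) and ends with AG (acceptor)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


if __name__ == "__main__":
    genomic = "AAAGTCCCCAGTTT"          # toy sequence: exon, intron, exon
    exons = [(0, 3), (11, 14)]
    for intron in introns_from_exons(genomic, exons):
        print(intron, has_canonical_splice_sites(intron))
```

An alignment whose gaps do not start with GT and end with AG is a hint that the exon boundaries may be placed incorrectly.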

BLAT genome Browser

BLAT genome Browser: Using a search term or a position, e.g. Chr1:10,234-11,567
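A position string of this form has three parts: chromosome, start and end, with commas as thousands separators. The sketch below splits such a string. The function name `parse_position` and the accepted pattern are assumptions for illustration and do not describe the browser's own parser.

```python
import re

# Assumed pattern: "chr" + name, a colon, then start-end with optional commas.
POSITION_RE = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$", re.IGNORECASE)


def parse_position(text):
    """Split a position such as 'Chr1:10,234-11,567' into (chrom, start, end)."""
    match = POSITION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised position: {text!r}")
    chrom, start, end = match.groups()
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start > end:
        raise ValueError("start must not be greater than end")
    return chrom, start, end


if __name__ == "__main__":
    print(parse_position("Chr1:10,234-11,567"))
```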

BLAT genome Browser

BLAT genome Browser Using a protein or DNA sequence

BLAT genome Browser

BLAT genome Browser ”Details” Correct splice site ?

Logo Plot: Information Content
$IC = -H(p) + \log_2(4) = \sum_a p_a \log_2 p_a + 2$ (for DNA, where $a$ runs over the 4 nucleotides)
The information content is calculated from a multiple sequence alignment. The result is a graphical visualization of sequence conservation where:
- Total height at a position is the information content
- Height of a single letter is proportional to the frequency of that letter
Multiple alignment of 3 protein sequences (20 possible amino acids, so the constant is $\log_2(20) = 4.32$):
Seq1: A L R K P Q R T
Seq2: A V R H I L L I
Seq3: A I K V H N N T
Pos1: $I = 1 \cdot \log_2(1) + \log_2(20) = 4.32$
Pos2: $I = \tfrac{1}{3}\log_2\tfrac{1}{3} + \tfrac{1}{3}\log_2\tfrac{1}{3} + \tfrac{1}{3}\log_2\tfrac{1}{3} + \log_2(20) = 2.74$
Pos3: $I = \tfrac{2}{3}\log_2\tfrac{2}{3} + \tfrac{1}{3}\log_2\tfrac{1}{3} + \log_2(20) = 3.40$
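The per-position calculation can be reproduced with a short sketch. The function name `information_content` is hypothetical. The sketch uses the plain formula, the sum of p*log2(p) over the observed letters plus log2 of the alphabet size, without the small-sample correction that some logo tools apply.

```python
import math
from collections import Counter


def information_content(column, alphabet_size):
    """Information content (bits) of one alignment column.

    IC = sum_a p_a * log2(p_a) + log2(alphabet_size), where p_a is the
    observed frequency of letter a in the column.
    """
    counts = Counter(column)
    total = len(column)
    neg_entropy = sum((c / total) * math.log2(c / total) for c in counts.values())
    return neg_entropy + math.log2(alphabet_size)


if __name__ == "__main__":
    alignment = ["ALRKPQRT", "AVRHILLI", "AIKVHNNT"]
    # zip(*alignment) yields the columns of the alignment.
    for pos, column in enumerate(zip(*alignment), start=1):
        print(f"Pos{pos}: {''.join(column)} {information_content(column, 20):.2f}")
```

For the three-sequence protein alignment, the first three columns (AAA, LVI, RRK) come out at about 4.32, 2.74 and 3.40 bits. A fully conserved column reaches the maximum of log2(20).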

Logo Plot Exon

BLAT genome Browser ”Details” Correct splice site ?

BLAT genome Browser ”Details”
Donor site: exon ... G | GT ... intron
Acceptor site: intron ... AG | exon ...

BLAT genome Browser

BLAT genome Browser ”Browser”
Base, center & zoom
Tracks: known genes, predictions, RNA, EST, conservation, expression

Genome browsers

BLAT genome Browser Center & zoom

Forward/reverse direction Selected number of tracks

BLAT genome Browser Sequence Orthologs

“click”

BLAT genome Browser Sequence Orthologs

SNPs

Single Nucleotide Polymorphism (SNP)
- SNPs can be located anywhere in the genome
- Non-synonymous SNP (nsSNP): the amino acid is changed (shown below)
- Synonymous SNP: does not affect the protein
- An amino acid is coded by 3 nucleotides, e.g. Valine (V): GTC
[Figure labels: V I T P]
Humans are diploid: cells have 2 homologous copies of each chromosome, i.e. 2*23 chromosomes. Haploid cells (sex cells) have only 23 chromosomes.

Diploid organism - most mammals
An example of two homologous copies of e.g. chromosome 9 within a cell: a chromosome from the mother and a chromosome from the father.
If the red strand is the plus strand: C;T (or T;C, but we write it alphabetically)
If the green strand is the minus strand: G;A, but we write it alphabetically as A;G
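The strand convention can be made concrete with a small sketch: the alleles on the minus strand are the complements of those on the plus strand, and a genotype is written in alphabetical order. The function names are hypothetical and chosen only for this example.

```python
# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def format_alleles(alleles):
    """Write a pair of alleles in alphabetical order, separated by ';'."""
    return ";".join(sorted(a.upper() for a in alleles))


def alleles_on_minus_strand(plus_alleles):
    """Return the alleles as they read on the minus strand (complemented)."""
    return tuple(COMPLEMENT[a.upper()] for a in plus_alleles)


if __name__ == "__main__":
    plus = ("T", "C")
    print("plus strand: ", format_alleles(plus))
    print("minus strand:", format_alleles(alleles_on_minus_strand(plus)))
```

The same SNP is therefore reported as C;T or A;G depending on which strand the database uses as reference.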

SNP nomenclature
SNPs within a coding region of a piece of DNA, i.e. within an exon region, might cause a change in the translated protein. Also, SNPs at the boundary of intron/exon regions can have an effect on the protein product.
- nsSNP (non-synonymous SNP)
- cSNP (coding SNP)
- Missense SNPs or mutations: nsSNP and cSNP
- Nonsense SNPs are those that result in a stop codon
SNPs within an exon region that do NOT change the protein product:
- sSNP (synonymous SNP)
[Figure labels: ATG, 5’]
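These categories can be illustrated by translating a codon before and after a single-base change. The sketch below assumes the standard genetic code and a codon that already codes for an amino acid. The function name `classify_snp` is hypothetical, and SNPs at splice sites or outside exons are out of scope.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, alt_base):
    """Classify a single-base change within a sense codon.

    `position` is 0, 1 or 2 (index within the codon). Returns 'synonymous',
    'missense' or 'nonsense'. The reference codon is assumed not to be a
    stop codon.
    """
    codon = codon.upper()
    mutated = codon[:position] + alt_base.upper() + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[mutated]
    if alt_aa == ref_aa:
        return "synonymous"   # sSNP: protein unchanged
    if alt_aa == "*":
        return "nonsense"     # change introduces a stop codon
    return "missense"         # nsSNP: amino acid is changed


if __name__ == "__main__":
    print(classify_snp("GTC", 0, "A"))  # valine codon, first base changed
    print(classify_snp("GTC", 2, "T"))  # valine codon, third base changed
    print(classify_snp("TGG", 2, "A"))  # tryptophan codon, third base changed
```

Changes at the third codon position are often synonymous because many amino acids, valine included, are coded by several codons that differ only in the last base.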

SNPs

Exercise
1. Basic understanding of the graphics
2. Effect of Single Nucleotide Polymorphisms (SNPs)
3. Finding orthologous genes
4. Identifying the chromosomal locus for a gene