Visualization of genomic data

Visualization of genomic data: Genome browsers (16-Nov-18)

Survey: UCSC browser? Ensembl browser? Others?

UCSC genome browser: Basic functionalities used in the exercise
- Finding a gene: by name or by sequence
- Gene structure
- Orthologues, i.e. functional homologs in other organisms
- SNPs (Single Nucleotide Polymorphisms)
Several other functionalities:
- Gene Sorter: sort according to expression, homology, in situ images of genes in different tissues
- Custom tracks: upload your own data

Genome browsers: Visualization of a gene
Flat files / tab files:
>chr5:123.004.678-125.345.112
ATGAAGTTATGGGATGTCGTGGCTGTCTGCCTGGTGCTGCTCCACACCGC
GTCCGCCTTCCCGCTGCCCGCCGGTAAGAGGCCTCCCGAGGCGCCCGCCG
AAGACCGCTCCCTCGGCCGCCGCCGCGCGCCCTTCGCGCTGAGCAGTGAC
TGTAAGAACCGTTCCCTCCCCGCGGGGGGGCCGCCGGCGGACCCCCTCGC
ACCCCCACCCGCAGCCAGCCCCGCACGTACCCCAAGCCAGCCTGATGGCT
GTGTGGCCTACCGACCCGTGGGCAAGGGGTGCGGGTGCTGAAGCCCCCAG
GGGTGCCTGGCTGCCCACTGCTGCCCGCACGCCTGGCCTGAAAGTGACAC
GCGCTGGTTTGCCCAGCACAGAGGGGATGGAATTTTTATGCTGCTCCTTT
AGCATTCTGATGAACAAATATCCTCCCCACCAGCACCACCACCTCAGTAA
Figure: gene structure on Chr5 (exon, intron, exon) with positions 123.004.678, 123.404.678, 124.987.012 and 125.345.112 marked.
Open Reading Frame (ORF): from start codon to stop codon.
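The ORF definition on this slide (from a start codon to the first in-frame stop codon) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `find_orf` and the toy sequence are made up for this example, only the forward strand is scanned, and introns are ignored, so it would not recover the real ORF of a spliced gene from genomic sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(seq, start_codon="ATG"):
    """Return (start, end) of the first ORF on the forward strand.

    Coordinates are 0-based and half-open, so seq[start:end] begins with
    the start codon and ends with the stop codon. Returns None if no
    start codon is followed by an in-frame stop codon.
    """
    seq = seq.upper()
    start = seq.find(start_codon)
    while start != -1:
        # Walk codon by codon in the reading frame set by the start codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop found: try the next start codon.
        start = seq.find(start_codon, start + 1)
    return None


# Hypothetical toy sequence: ATG AAA TGG TAA flanked by two bases on each side.
toy = "CCATGAAATGGTAACC"
orf = find_orf(toy)
if orf is not None:
    print(orf, toy[orf[0]:orf[1]])
```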

Genome browsers: Why a graphic display?
Why is a graphic display better than flat files / tab files?
- A graphic display is compact.
- Metadata is available, i.e. supporting information about a gene:
  - experimental evidence such as ESTs
  - predicted gene structures
  - SNP information
  - links to many databases
In short, much data about a gene is gathered in one place and can be viewed easily.

Genome browsers: Visualization of a gene (Ensembl)

Genome browsers: Visualization of a gene (UCSC). Figure labels: exon, intron, UTR.

Genome browsers
- UCSC genome browser (http://genome.ucsc.edu/): easy to use; updated often, but not as often as Ensembl; allows upload of personal tracks.
- Ensembl browser (http://www.ensembl.org/index.html): less easy to use; maintained and updated by several people.
- GBrowse (http://www.gmod.org/GBrowse)

BLAT: BLAST-Like Alignment Tool
- BLAT (2002)
- Very fast searches (MySQL database)
- Handles introns in RNA/DNA alignments
- Checks that donor/acceptor rules are followed
- Data for more than 30 genomes (human, mouse, rat, ...)
Figure: exon, intron, exon. The splice sites are the donor site (GT) and the acceptor site (AG).

BLAT genome browser: http://genome.ucsc.edu/

BLAT genome browser: using a search term or a position, e.g. chr1:10,234-11,567
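A position string of this kind can be split into chromosome, start and end with a small parser. The sketch below is an assumption-laden illustration, not the browser's own parsing logic: `parse_region` is a hypothetical helper, and it accepts both commas and dots as thousands separators because both notations appear in these slides.

```python
import re


def parse_region(text):
    """Parse a string like 'chr1:10,234-11,567' into (chrom, start, end).

    Commas and dots inside the numbers are treated as thousands
    separators and removed. Raises ValueError on malformed input.
    """
    match = re.fullmatch(
        r"\s*(chr\w+):([\d.,]+)-([\d.,]+)\s*", text, flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"not a genomic position: {text!r}")
    chrom, start_text, end_text = match.groups()
    start = int(re.sub(r"[.,]", "", start_text))
    end = int(re.sub(r"[.,]", "", end_text))
    if start > end:
        raise ValueError("start must not be larger than end")
    # Normalize the prefix so 'Chr1' and 'chr1' give the same result.
    return "chr" + chrom[3:], start, end


print(parse_region("Chr1:10,234-11,567"))
print(parse_region("chr5:123.004.678-125.345.112"))
```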

BLAT genome browser: http://genome.ucsc.edu/

BLAT genome browser: using a protein or DNA sequence

BLAT genome browser

BLAT genome browser, "Details": Correct splice site?

Logo plot: Information content
For DNA (4 nucleotides): $IC = -H(p) + \log_2(4) = \sum_a p_a \log_2 p_a + 2$
The information content is calculated from a multiple sequence alignment. The result is a graphical visualization of sequence conservation where:
- the total height at a position is the information content;
- the height of a single letter is proportional to the frequency of that letter.
Multiple alignment of 3 protein sequences (20 amino acids, so the constant becomes $\log_2(20) = 4.32$):
Seq1: A L R K P Q R T
Seq2: A V R H I L L I
Seq3: A I K V H N N T
Pos1: I = [1*log2(1)] + 4.32 = 4.32
Pos2: I = [1/3*log2(1/3) + 1/3*log2(1/3) + 1/3*log2(1/3)] + 4.32 = 2.74
Pos3: I = [2/3*log2(2/3) + 1/3*log2(1/3)] + 4.32 = 3.40
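The per-position calculation on this slide can be reproduced with a short script. This is a minimal sketch: the function name `information_content` is chosen for this example, the alphabet size of 20 assumes protein sequences, and no small-sample correction is applied (logo tools often add one).

```python
import math
from collections import Counter


def information_content(column, alphabet_size=20):
    """Information content (bits) of one alignment column.

    Implements IC = sum_a p_a * log2(p_a) + log2(alphabet_size),
    where p_a is the observed frequency of letter a in the column.
    """
    counts = Counter(column)
    total = len(column)
    entropy_term = sum(
        (count / total) * math.log2(count / total) for count in counts.values()
    )
    return entropy_term + math.log2(alphabet_size)


# The three aligned protein sequences from the slide.
alignment = ["ALRKPQRT", "AVRHILLI", "AIKVHNNT"]

# zip(*alignment) turns rows (sequences) into columns (positions).
columns = ["".join(column) for column in zip(*alignment)]
ic_per_position = [information_content(column) for column in columns]

for position, (column, ic) in enumerate(zip(columns, ic_per_position), start=1):
    print(f"Pos{position}: {column} IC = {ic:.2f}")
```

Passing `alphabet_size=4` gives the DNA version of the formula, where the constant is log2(4) = 2.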

Logo plot: Exon

BLAT genome browser, "Details": Correct splice site?

BLAT genome browser, "Details":
Donor site | Acceptor site
exon...G | GT...intron...AG | exon...
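The donor/acceptor rule shown here (an intron starts with GT and ends with AG) is easy to check once exon coordinates are known. The sketch below is an illustration only and does not reproduce how BLAT itself scores splice sites: the function names, the 0-based half-open exon coordinates and the toy sequence are all assumptions made for this example.

```python
def has_canonical_splice_sites(intron):
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def introns_between(exons):
    """Return intron intervals lying between consecutive exons.

    Exons are (start, end) pairs, 0-based and half-open.
    """
    exons = sorted(exons)
    return [
        (previous_end, next_start)
        for (_, previous_end), (next_start, _) in zip(exons, exons[1:])
    ]


def check_splice_sites(genomic, exons):
    """Return (intron_start, intron_end, is_canonical) for every intron."""
    return [
        (start, end, has_canonical_splice_sites(genomic[start:end]))
        for start, end in introns_between(exons)
    ]


# Hypothetical toy gene: exon 'ATGGCG', intron 'GTAAGTCCCTTTCAG', exon 'GCTTAA'.
genomic = "ATGGCGGTAAGTCCCTTTCAGGCTTAA"
print(check_splice_sites(genomic, [(0, 6), (21, 27)]))
# An exon boundary that is off by one base breaks the GT rule.
print(check_splice_sites(genomic, [(0, 5), (21, 27)]))
```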

BLAT genome browser

BLAT genome browser, "Browser": Base, Center & Zoom; Known genes; Predictions; RNA; EST; Expression; Conservation

BLAT genome browser, Center & zoom: selected number of tracks; forward/reverse direction

BLAT genome browser: Sequence Orthologs

BLAT genome browser: Sequence Orthologs ("click")

SNPs

Single Nucleotide Polymorphism (SNP)
- SNPs can be located anywhere in the genome.
- A non-synonymous SNP (nsSNP) changes the amino acid (figure labels: V, I, T, P).
- A synonymous SNP does not affect the protein.
- An amino acid is coded by 3 nucleotides, e.g. valine (V): GTC.
- Humans are diploid: cells have 2 homologous copies of each chromosome, i.e. 2*23 chromosomes. Haploid cells (sex cells) have only 23 chromosomes.
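Whether a coding SNP is synonymous or non-synonymous can be decided by translating the codon before and after the change. The following sketch uses the standard genetic code; the function name `classify_snp` and the example substitutions on the valine codon GTC are assumptions for illustration, and a change to a stop codon is simply reported as non-synonymous here.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def classify_snp(codon, position, new_base):
    """Classify a single-base change inside a codon.

    position is 0, 1 or 2 (index within the codon).
    Returns (reference_amino_acid, altered_amino_acid, kind).
    """
    codon = codon.upper()
    mutated = codon[:position] + new_base.upper() + codon[position + 1:]
    reference = CODON_TABLE[codon]
    altered = CODON_TABLE[mutated]
    kind = "synonymous" if reference == altered else "non-synonymous"
    return reference, altered, kind


# Valine codon GTC: a change at the first base alters the amino acid,
# a change at the third base does not.
print(classify_snp("GTC", 0, "A"))
print(classify_snp("GTC", 2, "T"))
```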

Diploid organism (most mammals)
An example of two homologous copies of, e.g., chromosome 9 within a cell: one chromosome from the mother and one from the father.
- If the red strand is the plus strand: C;T (or T;C, but we write the alleles alphabetically).
- If the green strand is the minus strand: G;A, but we write it as A;G.
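The strand and ordering convention above can be expressed as two small functions: complement each allele to move to the other strand, then sort the two alleles alphabetically. This is a sketch under the assumption that genotypes are written as two bases separated by a semicolon; the function names are made up for this example.

```python
# Translation table that maps each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize_genotype(genotype):
    """Write the two alleles of a genotype in alphabetical order."""
    alleles = genotype.upper().split(";")
    return ";".join(sorted(alleles))


def to_other_strand(genotype):
    """Report a genotype as seen on the opposite strand, alphabetically ordered."""
    return normalize_genotype(genotype.upper().translate(COMPLEMENT))


print(normalize_genotype("T;C"))  # plus strand, alleles sorted
print(to_other_strand("C;T"))     # same SNP read from the minus strand
```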

Exercise
- Basic understanding of the graphics
- Effect of Single Nucleotide Polymorphisms (SNPs)
- Finding orthologous genes
- Identifying the chromosomal locus for a gene