Genome Browsers UCSC (Santa Cruz, California) and Ensembl (EBI, UK)



Eukaryotic Genomes: Not only collections of genes
– Protein-coding genes
– RNA genes (rRNA, snRNA, snoRNA, miRNA, tRNA)
– Structural DNA (centromeres, telomeres)
– Regulation-related sequences (promoters, enhancers, silencers, insulators)
– Parasitic sequences (transposons)
– Pseudogenes (non-functional gene-like sequences)
– Simple sequence repeats

Eukaryotic Genomes: High fraction of non-coding DNA
Figure legend: blue = prokaryotes; black = unicellular eukaryotes; other colors = multicellular eukaryotes (red = vertebrates)
Source: Mattick, NRG, 2004

Human Genome
– 3 billion base pairs (3 Gb)
– 22 chromosome pairs + X and Y chromosomes
– Chromosome length varies from ~50 Mb to ~250 Mb
– About [number missing in source] protein-coding genes
  – compare with ~14,000 for the fruit fly and ~19,000 for the nematode C. elegans

Human genome
– Only 1.2% codes for proteins; 3.5-5% is under selection
– Long introns, short exons
– Large spaces between genes
– More than half consists of repetitive DNA
Source: Molecular Biology of the Cell (4th edition) (Alberts et al., 2002)
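
As a rough back-of-the-envelope check, the percentages on this slide can be converted into megabases. The sketch below assumes the approximate 3 Gb genome size given on the Human Genome slide; the function and constant names are illustrative only.

```python
# Approximate human genome size (3 Gb), taken from the Human Genome slide.
GENOME_SIZE_BP = 3_000_000_000


def percent_to_megabases(percent, genome_size_bp=GENOME_SIZE_BP):
    """Convert a percentage of the genome into megabases (Mb)."""
    return genome_size_bp * percent / 100 / 1_000_000


# Percentages quoted on the slide.
for label, percent in [
    ("protein-coding", 1.2),
    ("under selection, lower bound", 3.5),
    ("under selection, upper bound", 5.0),
]:
    print(f"{label}: about {percent_to_megabases(percent):.0f} Mb")
```

Under these assumptions, the protein-coding part comes to roughly 36 Mb, and the part under selection to somewhere between roughly 105 and 150 Mb.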

Variation Along the Genome Sequence
– Nucleotide usage varies along chromosomes
  – Protein-coding regions tend to have high GC levels
– Genes are not equally distributed across the chromosomes
  – Housekeeping genes are generally in gene-dense areas
  – Gene-poor areas tend to have many tissue-specific genes
Source: Ensembl
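
To make "nucleotide usage varies along chromosomes" concrete, GC content can be computed in windows along a sequence. This is a minimal sketch on a made-up toy sequence; the window size, step, and function names are assumptions chosen for illustration, and real analyses use much larger windows.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def windowed_gc(seq, window=4, step=4):
    """Return (start, gc_fraction) pairs for windows of fixed size along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: an AT-rich stretch followed by a GC-rich stretch.
toy = "ATATATATGCGCGCGCATGC"
for start, gc in windowed_gc(toy):
    print(start, gc)
```

In this toy example the GC fraction jumps from 0.0 in the first two windows to 1.0 in the next two, which is the kind of local shift a GC-content track in a genome browser displays at a much larger scale.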

Chromosome organisation
– DNA is packed in chromatin
– Active genes are in less dense chromatin (beads-on-a-string)
– Inactive genes are often in densely packed chromatin (30-nm fiber)
– Gene regulation by changing chromatin density and by methylation/acetylation of the histones
– Limited availability of chromatin information in genome browsers (post-translational histone modifications are currently under investigation with ChIP-on-chip experiments)
Source: Lodish (4th edition)

Genome browsers: UCSC, NCBI, Ensembl

Genome Browsing With the UCSC Genome Browser

UCSC Genome browser

Choose a species, an assembly and a gene
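
The same choice of assembly and region can also be expressed as a link to the browser. The sketch below assumes the `db` and `position` query parameters of the UCSC `hgTracks` page; the assembly name `hg38` and the coordinates are arbitrary placeholders, not values from the slides, and the helper names are hypothetical.

```python
import re
from urllib.parse import urlencode

# Base address of the UCSC track display page (assumed unchanged).
UCSC_TRACKS_URL = "https://genome.ucsc.edu/cgi-bin/hgTracks"

# Browser-style position such as "chr1:100,000-200,000".
_POSITION = re.compile(r"^(?P<chrom>[^:]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_position(text):
    """Split a browser position string into (chromosome, start, end)."""
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end position: {text!r}")
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return match["chrom"], start, end


def browser_url(assembly, position):
    """Build a link that opens the browser on the given assembly and position."""
    chrom, start, end = parse_position(position)
    query = urlencode({"db": assembly, "position": f"{chrom}:{start}-{end}"}, safe=":")
    return f"{UCSC_TRACKS_URL}?{query}"


# Placeholder assembly and coordinates, for illustration only.
print(browser_url("hg38", "chr1:100,000-200,000"))
```

Normalising the position first means that coordinates copied from the browser with thousands separators still produce a clean link.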

Gene search results

Genome browser

Genomic Datatypes (Tracks)

Transcription data: rather complicated

Browser → Gene record

Gene record

Gene record (2)

Gene record (3)

Gene record (4) “best hit”

Gene record (5)

Genomic elements
– Genome browsers can be used to examine other things:
  – Genomic sequence conservation
  – Pseudogenes
  – Duplications and deletions of chromosome segments (Copy Number Variations, CNVs)

Genomic Sequence Conservation
– Not only protein-coding parts are conserved in evolution
– Conserved non-coding genomic sequences can be involved in gene regulation (enhancers, silencers, insulators)
– With the UCSC browser one can examine genomic conservation
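
The idea of a conservation score can be illustrated on a tiny multiple alignment. This is a deliberately simple stand-in (fraction of sequences sharing the most common base per column), not the method behind the UCSC conservation tracks; the alignment and function name are made up for illustration.

```python
def column_conservation(alignment):
    """Per-column fraction of sequences that share the most common non-gap base."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        bases = [base for base in column if base != "-"]
        if not bases:
            scores.append(0.0)  # column consists of gaps only
            continue
        most_common = max(bases.count(base) for base in set(bases))
        # Gaps count against conservation: divide by the number of sequences.
        scores.append(most_common / len(column))
    return scores


# Toy alignment of four hypothetical species; "-" marks a gap.
toy_alignment = ["ACGTA", "ACGAA", "ACCTA", "AC-TA"]
print(column_conservation(toy_alignment))
```

Columns where every sequence agrees score 1.0, while the variable third and fourth columns score lower; a run of high-scoring columns outside any exon is the pattern that hints at a conserved regulatory element.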

Genomic Conservation (UCSC)

Pseudogenes
– Pseudogenes “look” like (are homologous to) protein-coding genes, but are non-functional
– Two types:
  – Unprocessed pseudogenes (loss of function)
  – Processed pseudogenes (mRNAs that are reverse-transcribed and inserted into the genome, so they lack introns and sometimes have a poly(A) tail)
– The UCSC browser contains various databases of pseudogenes:
  – Yale pseudogenes (both types of pseudogenes)
  – Vega pseudogenes (both types of pseudogenes)
  – Retroposed genes (only processed pseudogenes)
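
The distinction between the two types can be written down as a small rule of thumb. The sketch below is a hypothetical heuristic built only from the two features named on the slide (missing introns, poly(A) tail); the data class, field names, and the poly(A) run length of 10 are assumptions, and real pseudogene pipelines are considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class PseudogeneCandidate:
    """Hypothetical summary of a pseudogene and its functional parent gene."""
    name: str
    parent_exon_count: int  # exons in the functional parent gene
    retained_introns: int   # parent introns still present in the candidate
    downstream_seq: str     # genomic sequence directly 3' of the candidate


def has_poly_a(seq, min_run=10):
    """True if seq contains a run of at least min_run consecutive A bases."""
    return "A" * min_run in seq.upper()


def classify(candidate):
    """Label a candidate as 'processed', 'unprocessed' or 'undetermined'."""
    if candidate.retained_introns > 0:
        return "unprocessed"  # introns survive, so it was not copied from mRNA
    if candidate.parent_exon_count > 1:
        return "processed"    # a multi-exon parent lost all of its introns
    # Single-exon parent: intron loss is uninformative, fall back on poly(A).
    return "processed" if has_poly_a(candidate.downstream_seq) else "undetermined"


toy = PseudogeneCandidate("toyP1", parent_exon_count=5, retained_introns=0,
                          downstream_seq="GGCAAAAAAAAAAAATTC")
print(toy.name, classify(toy))
```

The poly(A) check is only used as a tie-breaker because, as the slide notes, processed pseudogenes have a poly(A) tail only sometimes.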

Pseudogenes (UCSC)

Copy Number Variation
– People do not only vary at the nucleotide level (SNPs); short pieces of the genome can be present in varying numbers of copies (Copy Number Polymorphisms, CNPs, or Copy Number Variants, CNVs)
– When there are genes in the CNV regions, this can lead to variation in the number of gene copies between individuals
– With the UCSC browser CNVs can be examined
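
Checking which genes fall inside a CNV region comes down to an interval-overlap test. The sketch below uses made-up gene names and coordinates and assumes half-open intervals (start included, end excluded); none of these values come from the slides.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def genes_in_cnv(cnv, genes):
    """Return names of genes on the CNV's chromosome that overlap the CNV."""
    chrom, start, end = cnv
    return [
        name
        for name, g_chrom, g_start, g_end in genes
        if g_chrom == chrom and overlaps(start, end, g_start, g_end)
    ]


# Toy annotation: (name, chromosome, start, end); all values are invented.
toy_genes = [
    ("geneA", "chr1", 100, 500),
    ("geneB", "chr1", 800, 1200),
    ("geneC", "chr2", 100, 500),
]
toy_cnv = ("chr1", 400, 900)
print(genes_in_cnv(toy_cnv, toy_genes))
```

Here the toy CNV touches geneA and geneB but not geneC, which sits on another chromosome; individuals with different copy numbers of this region would then carry different numbers of copies of the two overlapping genes (at least in part).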

Copy Number Variation (UCSC)

Finding a sequence in the genome

BLAT – Search page

BLAT - Results
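
When BLAT is run outside the web page, its default output is the tab-separated PSL format, which carries the same kind of information as the results table. The sketch below parses one PSL alignment line and derives two simple summaries; the example line is invented, and the identity figure is a naive ratio, not the score or identity that the web interface reports.

```python
# The 21 columns of a PSL alignment line, in order.
PSL_FIELDS = [
    "matches", "misMatches", "repMatches", "nCount",
    "qNumInsert", "qBaseInsert", "tNumInsert", "tBaseInsert",
    "strand", "qName", "qSize", "qStart", "qEnd",
    "tName", "tSize", "tStart", "tEnd",
    "blockCount", "blockSizes", "qStarts", "tStarts",
]
LIST_FIELDS = ("blockSizes", "qStarts", "tStarts")
TEXT_FIELDS = ("strand", "qName", "tName")


def parse_psl_line(line):
    """Turn one PSL alignment line into a dict with typed values."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(PSL_FIELDS):
        raise ValueError(f"expected {len(PSL_FIELDS)} columns, got {len(values)}")
    record = dict(zip(PSL_FIELDS, values))
    for key in PSL_FIELDS:
        if key in LIST_FIELDS:
            # Comma-separated lists end with a trailing comma.
            record[key] = [int(x) for x in record[key].split(",") if x]
        elif key not in TEXT_FIELDS:
            record[key] = int(record[key])
    return record


def query_coverage(record):
    """Fraction of the query sequence spanned by the alignment."""
    return (record["qEnd"] - record["qStart"]) / record["qSize"]


def naive_identity(record):
    """Matching bases divided by all aligned bases (a simplification)."""
    aligned = record["matches"] + record["misMatches"] + record["repMatches"]
    return (record["matches"] + record["repMatches"]) / aligned if aligned else 0.0


# Invented example line: a 120-base query aligned over 100 bases on chr1.
toy_line = "\t".join([
    "95", "5", "0", "0", "0", "0", "0", "0", "+", "query1", "120", "10", "110",
    "chr1", "1000000", "5000", "5100", "1", "100,", "10,", "5000,",
])
hit = parse_psl_line(toy_line)
print(hit["tName"], hit["tStart"], hit["tEnd"], query_coverage(hit), naive_identity(hit))
```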

BLAT – “Details”

BLAT – “Browser”

Genome browsers: UCSC, Ensembl

Genome Browsing With the Ensembl Genome browser

Ensembl Genome browser

The Human Genome

MapView – Overview chromosome

ContigView – Zooming in (compare UCSC)

ContigView (2)

GeneView – Gene record

TransView - mRNA Transcript

TransView - mRNA Transcript (2)

Alternative Transcripts
Source: Wikipedia

GeneView - Show Alternative Transcripts

GeneSpliceView - Alternative Transcripts

Single Nucleotide Polymorphisms (SNPs)
– Sequence variations within a species
– Similar to mutations, but the variants are simultaneously present in the population and generally have little effect
– Used as genetic markers (e.g. a genetic disease can be associated with a SNP)
– Ensembl offers a nice SNP view
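
That several variants are "simultaneously present in the population" can be expressed as allele frequencies at one genomic position. The sketch below uses invented allele observations; the 1% minor-allele threshold is a commonly used convention assumed here, not something stated on the slide.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Map each observed allele at one position to its frequency in the sample."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def is_polymorphic(alleles, min_minor_freq=0.01):
    """True if a second allele reaches at least min_minor_freq in the sample."""
    freqs = sorted(allele_frequencies(alleles).values(), reverse=True)
    return len(freqs) > 1 and freqs[1] >= min_minor_freq


# Invented sample: alleles observed at one position in ten chromosomes.
toy_alleles = list("AAAAAAAGGG")
print(allele_frequencies(toy_alleles), is_polymorphic(toy_alleles))
```

A position where every sampled chromosome carries the same base is not polymorphic in that sample, which is what separates a SNP from a fixed difference or a private mutation.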

GeneView - Show SNPs

GeneSNPView - SNPs

GeneView - Show Protein

ProtView - Protein

ProtView - Protein Sequence

ProtView – Search proteins with the same domains

DomainView – Proteins with a certain domain (InterPro = SMART + Pfam + others)

ProtView - Find Proteins In the Same Protein Family

FamilyView – Alignments of homologous proteins

Finding Human Genes

Finding a human gene (2)

Blast

Blast (2)

UCSC vs Ensembl: Which is better?
– They contain more or less the same information
– UCSC is a bit easier to use
– Ensembl gives more detailed information and more flexible data export
– Other small differences in data (e.g. UCSC has more extensive genomic conservation data)
– Whatever you are familiar with!