How to access genomic information using Ensembl August 2005
GOAL
Status of the human sequence
–Finished (red/orange): ~96% (99.999% accurate)
–Heterochromatin (grey): ~4%
–30-40% repetitive elements (e.g. Alpha satellite, Alu repeats)
–All known genes correctly identified (99.74%)
–Assembled draft sequence totals 2.85 Gb
Finishing the euchromatic sequence of the human genome, Nature 431: (2004)
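As a rough worked example of what these figures imply, the sketch below turns the quoted 99.999% accuracy into an expected number of base errors, and the 30-40% repeat content into megabases. It assumes both percentages apply uniformly to the full 2.85 Gb, which the slide does not state; the variable names are illustrative.

```python
# Figures quoted on the slide.
ASSEMBLED_BASES = 2_850_000_000  # 2.85 Gb

# 99.999% accurate is the same as 1 error per 100,000 bases.
# Assumption: the accuracy figure applies uniformly to all 2.85 Gb.
expected_errors = ASSEMBLED_BASES // 100_000

# Assumption: the 30-40% repeat content is a fraction of the same 2.85 Gb.
repeat_low = ASSEMBLED_BASES * 30 // 100
repeat_high = ASSEMBLED_BASES * 40 // 100

print(f"Expected base errors: about {expected_errors:,}")
print(f"Repetitive sequence: {repeat_low // 1_000_000}-{repeat_high // 1_000_000} Mb")
```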
[Diagram: Ensembl data flow, with labels Analysis DB, CPU, Final DB, Supporting Databases, SNP, Manual Annotation]
Genome browsing: why present the whole genome?
–Explore what is in a chromosome region
–See features in and around a specific gene
–Search & retrieve across the whole genome
–Investigate genome organization
–Compare to other genomes
Genome browsers
–NCBI Map Viewer
–UCSC Human Genome Browser
–Ensembl: public site + installable system
Introduction to the Ensembl web site
Ensembl …
–… takes genomic sequence assemblies (human build 35, mouse, rat, mosquito…)
–adds annotation and links (automated process)
–presents all the data on a web site
Basic Genome Annotation
Genes
–Genomic location
–Gene model structures: exons, introns, UTRs
–Transcript(s): pseudogenes, non-coding RNA
–Protein(s)
–Links to other sources of information
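The gene model elements listed above nest naturally: a gene has a genomic location and one or more transcripts, each transcript is a set of exons, and introns are the gaps between consecutive exons. The sketch below is one minimal way to represent that in Python. The class names, the gene name and all coordinates are hypothetical, chosen for illustration; this is not Ensembl's own data model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Exon:
    start: int  # 1-based, inclusive genomic coordinate (assumed convention)
    end: int


@dataclass
class Transcript:
    name: str
    exons: List[Exon] = field(default_factory=list)

    def introns(self) -> List[Tuple[int, int]]:
        """Return introns as the gaps between consecutive exons."""
        ordered = sorted(self.exons, key=lambda e: e.start)
        return [(a.end + 1, b.start - 1) for a, b in zip(ordered, ordered[1:])]


@dataclass
class Gene:
    name: str
    chromosome: str
    transcripts: List[Transcript] = field(default_factory=list)


# Made-up gene and coordinates, for illustration only.
gene = Gene("GENE1", "20", [
    Transcript("GENE1-001", [Exon(100, 200), Exon(301, 400), Exon(551, 700)]),
])

print(gene.transcripts[0].introns())  # [(201, 300), (401, 550)]
```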
Advanced Genome Annotation
–Cytogenetic bands
–Polymorphic markers: Sequence Tagged Sites (STS)
–Genetic variation: Single Nucleotide Polymorphisms (SNPs), Deletion-Insertion Polymorphisms (DIPs), Short Tandem Repeats (STRs)
–Repetitive sequences
–Expressed Sequence Tags (ESTs)
–cDNAs or mRNAs from related species
–Regions of sequence homology
How to get started …
–Species homepage
–MapView
–Text search
–BLAST
–SSAHA
Homepage
MapView
BLAST and SSAHA
–See BLAST hits on the genome
Practical: BLAST and SSAHA
Query sequence:
–On which chromosome do you get the best hit?
–Explore the alignment of the query sequence with the genome
–Is this the sequence of a gene? If so, which one?
–Explore the region around this sequence
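The first question in the practical can be sketched as a small selection step: given a list of hits, pick the one with the highest score and read off its chromosome. The example below assumes "best" means highest score and uses invented hit records; the `Hit` type and its fields are hypothetical and do not reproduce the actual BLAST or SSAHA output format.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    chromosome: str
    start: int
    end: int
    score: float


# Hypothetical hits, invented for illustration.
hits = [
    Hit("7", 1_000, 1_450, 310.0),
    Hit("13", 52_000, 52_480, 905.5),
    Hit("X", 9_100, 9_300, 120.2),
]

# Assumption: the best hit is the one with the highest score.
best = max(hits, key=lambda h: h.score)
print(f"Best hit: chromosome {best.chromosome}, {best.start}-{best.end}")
```

The coordinates of the best hit are then the natural starting point for exploring the surrounding region in the browser.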
Regions, maps and markers
–MarkerView
–SNPView
–GeneSNPView
–ContigView
–CytoView
–SyntenyView
–MultiContigView
Ensembl ContigView
ContigView close-up
–Transcripts: red & black (Ensembl predictions), blue (Vega)
–Pop-up menu
ContigView - Chromosome 20 close-up
–Manual annotation via Vega
–Ensembl predictions
–Ensembl EST-based predictions
–Chromosomes with manual annotation: 1, 6, 7, 9, 10, 13, 14, 16, 18, 19, 20, 22, X and Y
CytoView
GeneSNPView
SNPView
MarkerView
SyntenyView
MultiContigView
Genes & gene products
–GeneView
–TransView
–ExonView
–ProteinView
–FamilyView
–DomainView
–GOView
–DiseaseView
Ensembl GeneView
ExonView TransView
ProteinView
FamilyView
GOView
Ensembl practical
Type the name of your favorite gene (e.g. BRCA2) and explore all the sections of Ensembl for this gene.
–Does this gene have an ortholog in mouse?
–How many different transcripts are known for this gene?
–How many exons does the longest transcript have?
–Which functional annotations does this gene have? (Hint: check the GO annotations.)
–Can you find SNPs in this gene?
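Once the transcripts of a gene have been noted down from the browser, the transcript and exon questions above reduce to simple counting. The sketch below works on made-up data: the transcript names and exon coordinates are hypothetical, and "longest" is assumed to mean the greatest total exon length.

```python
# Hypothetical transcripts for one gene: name -> list of (start, end) exons.
transcripts = {
    "T1": [(100, 200), (301, 400), (551, 700)],
    "T2": [(100, 200), (551, 700)],
}


def transcript_length(exons):
    """Total exonic length, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in exons)


# Assumption: the longest transcript is the one with the most exonic bases.
longest = max(transcripts, key=lambda name: transcript_length(transcripts[name]))

print(f"Transcripts: {len(transcripts)}")
print(f"Longest transcript: {longest} with {len(transcripts[longest])} exons")
```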