1
How to access genomic information using Ensembl
Damian Smedley and Xosé Fernández
Ensembl Project, European Bioinformatics Institute, Cambridge, UK
November 2004
2
2 of 45 Schedule
Today
– Introduction to the Ensembl system
– Hands-on examples to introduce the system
– Evaluating genes and transcripts
– Variation in Ensembl (SNPs, haplotypes)
Tomorrow
– Data mining with EnsMart
– Comparative genomics and proteomics in Ensembl
– BioMart
– Advanced topics (upload your own data, DAS)
3
Our goal
4
Assembly: from 325,109 initial contigs (plus other ordering data) to 26,720 overlapping clones, shown as a non-redundant “virtual contig” view
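As a rough illustration of what a non-redundant view over overlapping clones means, the sketch below merges overlapping clone spans into single regions. It is a toy under stated assumptions, not the Ensembl assembly method: the clone names and coordinates are made up, and `virtual_contigs` is a hypothetical helper that does plain interval merging.

```python
from typing import List, Tuple

# (hypothetical clone name, start, end), coordinates 1-based and inclusive
Clone = Tuple[str, int, int]


def virtual_contigs(clones: List[Clone]) -> List[Tuple[int, int, List[str]]]:
    """Merge overlapping clone spans into non-redundant regions.

    Simplified illustration only: the real assembly uses much more
    information than start/end coordinates.
    """
    merged: List[Tuple[int, int, List[str]]] = []
    for name, start, end in sorted(clones, key=lambda c: (c[1], c[2])):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it and record the clone.
            prev_start, prev_end, names = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), names + [name])
        else:
            # No overlap: start a new region.
            merged.append((start, end, [name]))
    return merged


# Made-up example data
clones = [
    ("cloneA", 1, 150_000),
    ("cloneB", 120_001, 260_000),
    ("cloneC", 400_000, 520_000),
]
contigs = virtual_contigs(clones)
for start, end, names in contigs:
    print(start, end, names)
```

Here cloneA and cloneB overlap and collapse into one region, while cloneC stays separate, so each base is represented once.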
5
Mapping and Sequencing the human genome
Figure labels: map; fragment; BACs (bacterial artificial chromosomes, avg size 150 kb); fragment; pUCs (avg size 2-4 kb); WGS; draft sequence assembly; finished BAC
Cited on the slide: Shizuya et al 1992; Dib et al 1996; Deloukas et al 1998; Osoegawa et al 2001; Bentley et al 2001; Bruls et al 2001; McPherson et al 2001; Montgomery et al 2001; Tilford et al 2001
6
Status of the human sequence
– Finished (red/orange): ~96% (99.999% accurate)
– Heterochromatin (grey): ~4%
– 30-40% repetitive elements (e.g. Alpha satellite, Alu repeats)
– All known genes correctly identified (99.74%)
– Assembled draft sequence totals 2.85 Gb
7
Human genome: Current status
22,287 ‘gene loci’ defined, consisting of 19,599 protein-coding genes in the human genome and 2,188 additional DNA segments ‘predicted’ to be protein-coding genes
– 1,183 genes ‘were born’ in the last 60-100 My
– ~30 genes ‘died’ in a similar time period
Finishing the euchromatic sequence of the human genome, Nature 431:931-45 (2004)
8
Ensembl - project aims
– funded to provide metazoan genomes to the world
– aims to provide the world’s best automated genome annotation
– a leading group for human and mouse analysis
– all software, data and results freely available
9
Ensembl - project background
– group split between EBI and Sanger
– mainly Wellcome Trust funded
– largest dedicated compute in biology in Europe
– developer community > 100 people, including companies
10
Ensembl – Open source
Freely available
Community development
– >51 Ensembl installs worldwide
– Both public and commercial, e.g. Gramene (CSHL), Fugu-sg (IMCB), Ciona-sg (Temasek)
11
Diagram labels: Analysis DB; CPU; Final DB; Supporting Databases; SNP; Manual Annotation; Ensembl
12
Genome browsing: why present the whole genome?
– Explore what is in a chromosome region
– See features in and around a specific gene
– Search & retrieve across the whole genome
– Investigate genome organization
– Compare to other genomes
13
Genome browsers
– Ensembl (public site + installable system): http://www.ensembl.org
– NCBI Map Viewer: http://www.ncbi.nlm.nih.gov/mapview
– UCSC Human Genome Browser: http://genome.ucsc.edu
14
Introduction to the Ensembl web site
Ensembl …
– … takes genomic sequence assemblies (human build 34, mouse, rat, Fugu, mosquito)
– adds annotation and links (automated process)
– presents all the data on a web site
15
Annotation: genes
Known genes
– where?
– genomic structure?
– transcript(s)?
– protein(s)?
– orthologues?
– attach useful links
Novel genes
– how to predict?
– require evidence
– transcript(s)?
– protein(s)?
– orthologues?
– attach useful links
16
Annotation: other features
– markers and SNPs
– cytogenetic bands
– repeated sequences
– ESTs & other sequence records: where do they show sequence similarity?
– regions homologous to other species
17
How to get started …
– Species homepage
– Site map
– Map View
– Text search
– BLAST
– SSAHA
– Disease View
18
Homepage
19
Site map
20
MapView AnchorView
21
BLAST and SSAHA
23
Regions, maps and markers
– MarkerView
– SNPView
– ContigView
– CytoView
– SyntenyView
– MultiContigView
24
Ensembl ContigView
25
ContigView close-up
– Evidence
– Transcripts: red & black (Ensembl predictions), blue (Vega)
– Customising & short cuts
– Pop-up menu
26
ContigView - Chromosome 20 close-up
– Manual annotation via Vega
– Ensembl predictions
– Ensembl EST-based predictions
– Forward strand
– Reverse strand
Other chromosomes with manual annotation from http://vega.sanger.ac.uk: 6, 7, 9, 10, 13, 14, 20, 22, X
27
CytoView
28
GeneSNP View
29
MarkerView SNPView
30
Synteny View
31
MultiContig View
32
Genes & gene products
– GeneView
– TransView
– ExonView
– ProteinView
– FamilyView
– DomainView
– GOView
– DiseaseView
33
Ensembl GeneView
34
TransView ExonView
35
Protein View
36
Family View
37
GOView
38
DiseaseView
39
Data retrieval
– EnsMart
– Data sets on ftp site
– MySQL queries of databases
– Perl API access to databases
– Export View
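To give a feel for the kind of SQL question one might ask when retrieving data by region, here is a minimal sketch. It does not use the Ensembl databases or their schema: it builds a toy in-memory SQLite table (the `gene` table, its columns, the gene names and the `genes_in_region` helper are all hypothetical) and selects the genes overlapping a chromosome region.

```python
import sqlite3

# Toy stand-in for a genome annotation database. The real Ensembl
# databases are MySQL and use a different, much richer schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene ("
    "stable_id TEXT, chromosome TEXT, "
    "seq_start INTEGER, seq_end INTEGER, strand INTEGER)"
)
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?, ?, ?)",
    [
        ("GENE_A", "20", 1000, 5000, 1),     # made-up records
        ("GENE_B", "20", 4500, 9000, -1),
        ("GENE_C", "20", 20000, 25000, 1),
        ("GENE_D", "X", 1000, 5000, 1),
    ],
)


def genes_in_region(conn, chromosome, start, end):
    """Return stable IDs of genes overlapping chromosome:start-end."""
    rows = conn.execute(
        "SELECT stable_id FROM gene "
        "WHERE chromosome = ? AND seq_start <= ? AND seq_end >= ? "
        "ORDER BY seq_start",
        (chromosome, end, start),
    ).fetchall()
    return [row[0] for row in rows]


hits = genes_in_region(conn, "20", 4000, 10000)
print(hits)
```

The overlap test (feature start at or before the region end, feature end at or after the region start) catches genes that only partly fall inside the region, which is usually what a region-based view needs.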
41
EnsMart
42
Mouse differences
– Genomic sequence assembly based on whole genome shotgun, with finished ‘stitched’ BACs
– BACs are shown in CytoView (FPC map), but for most no sequence is available
43
Mouse CytoView
44
Help!
– context sensitive help pages - click
– access other documentation via generic home page
– email the helpdesk
HelpDesk / Suggestions
45
Thanks Ensembl Team
46
Ensembl Team November 2004
Project Leader: Ewan Birney (EBI), Tim Hubbard (Sanger)
Database Schema and Core API: Arne Stabenau, Yuan Chen, Ian Longden, Craig Melsopp, Glenn Proctor, Daniel Ríos, Guy Slater
Distributed Annotation System: Andreas Kähäri
Ensembl Web Team: James Stalker, Fiona Cunningham, James Smith
Vega Web Team: Patrick Meidl, Steve Trevanion
Analysis and Annotation Pipeline: Val Curwen, Steve Searle, Dan Andrews, Mario Caccamo, Laura Clarke, Martin Hammond, Jan-Hinnerk Vogel, Kevin Howe, Vivek Iyer, Kerstin Jekosch, Felix Kokocinski, Simon White
User Support: Xosé Mª Fernández, Michael Schuster
Comparative Genomics: Abel Ureta-Vidal, Javier Herrero Sánchez, Jessica Severin, Cara Woodwark
EnsMart & BioMart: Arek Kasprzyk, Damian Keefe, Darin London, Damian Smedley