Published by Brenda Kelly. Modified over 9 years ago.
1 of 42 Browsing Genes and Genomes with Ensembl
Maria Wilbe
Department of Animal Breeding and Genetics, SLU, Sweden
Maria.Wilbe@hgen.slu.se
2 of 42 Several lecture notes taken from:
Bert Overduin, Ensembl User Support, EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
Alvaro Martinez Barrio, Linnaeus Centre for Bioinformatics, Uppsala University, Sweden
3 of 42 What is Ensembl
A software system which produces and maintains automatic annotation on selected eukaryotic genomes.
- Performs automatic analysis of new genome data
- Keeps analysis and annotation maintained on the current data
- Presents the analysis to all via the web
Ensembl concentrates on vertebrate genomes, but other groups have adapted the system for use with plant and fungal genomes.
The "Powered by Ensembl" page shows a list of projects that use Ensembl technology.
4 of 42 Ensembl - Organisation
- Joint project between the European Bioinformatics Institute (EMBL-EBI) and the Wellcome Trust Sanger Institute
- Started in 1999 for the Human Genome Project
- Funded primarily by the Wellcome Trust, with additional funding by EMBL, EU, NIH-NIAID, BBSRC and MRC
- Team of ca. 40 people, led by Ewan Birney (EBI) and Tim Hubbard (Sanger)
- Uses the largest dedicated computer system in biology in Europe
5 of 42 A Bit of History
Sequenced genomes
Year  Organism                Genome size
1995  Haemophilus influenzae  1.8 Mb
1996  Yeast                   12 Mb
1998  C. elegans              100 Mb
1999  Fruit fly               125 Mb
2000  Arabidopsis             115 Mb
2001  Human (draft)
2002  Mouse                   2.6 Gb
2004  Human (“finished”)      3 Gb
6 of 42 Sequencing genomes
DNA sequencing is a method for determining the order of the nucleotide bases (A, T, C, G) in a DNA molecule.
7 of 42 Ensembl genomes (Ensembl release 49 - March 2008)
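8 of 42 Species in Ensembl
[Figure: vertebrate phylogeny on a geological timescale. Periods: Cambrian, Ordovician, Silurian, Devonian, Carboniferous, Permian, Triassic, Jurassic, Cretaceous, Tertiary (570, 505, 438, 408, 360, 286, 245, 208, 144 and 65 million years before present). Groups shown: non-vertebrates, agnathans, sharks, rays, bichir/Polypterus, teleosts, Latimeria, lungfishes, amphibians, reptiles (lizards, turtles, crocodiles), birds (paleognaths, passerines, other birds), mammals (monotremes, marsupials, placentals).]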
9 of 42 Ensembl - Goals
- Provide automatic annotation of genomic sequence
- Integrate other biological data
- Make data available to all via the web
10 of 42 Annotation
Wikipedia: Genome annotation is the process of attaching biological information to sequences. It consists of two main steps:
1. Identifying elements on the genome, a process called gene finding:
- ORFs and their localisation
- gene structure
- coding regions
- location of regulatory motifs
2. Attaching biological information to these elements:
- biochemical function
- biological function
- involved regulation and interactions
- expression
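To make the first step concrete, here is a minimal sketch of locating ORFs in a DNA string. It is a toy illustration, not the method Ensembl uses: it scans only the forward strand, treats ATG as the only start codon, and ignores introns. The function name `find_orfs` and the parameter `min_codons` are made up for this example.

```python
# Toy ORF scan: forward strand only, ATG start, no introns (illustrative assumptions).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) tuples; start is 0-based, end is exclusive."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons including the stop codon itself.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


example = "CCATGAAATGGTAGCC"
print(find_orfs(example))  # [(2, 14, 2)], i.e. ATGAAATGGTAG in frame 2
```

Real gene finders also have to handle the reverse strand, splicing and supporting evidence, which is why annotation pipelines combine several programs.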
11 of 42 The big Genome Browsers
- Ensembl Genome browser: http://www.ensembl.org
- NCBI Map Viewer: http://www.ncbi.nlm.nih.gov/mapview/
- UCSC Genome Browser: http://genome.ucsc.edu
12 of 42 Ensembl / NCBI Map Viewer / UCSC
- All provide access to multiple organisms
- All are based on the same data
- Annotations are different
- Assembly versions may differ
- Some organisms are available in only one of the browsers
13 of 42 NCBI Map Viewer - Opening page
14 of 42 NCBI Map Viewer - Result page
15 of 42 UCSC Genome Browser - Opening page
16 of 42 UCSC Genome Browser - Search page
17 of 42 UCSC Genome Browser - Default view
18 of 42 UCSC Genome Browser - Options
19 of 42 UCSC Genome Browser - BLAT search
20 of 42 Ensembl Genome Browser - Opening page
21 of 42 Ensembl Genome Browser - Search view (choose human gene)
22 of 42 Ensembl Genome Browser - Gene view
23 of 42 Ensembl Genome Browser - BLAST
24 of 42 What Distinguishes Ensembl from the UCSC and NCBI Browsers?
- Automatic annotation for those species for which no manually curated gene set exists
- Direct database access and programmatic access via the Perl API
- Not only the data, but also the software source code is open source
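As a rough illustration of programmatic access, the sketch below builds a lookup request for a stable ID. Note the swap: the slide names the Perl API, whereas this example uses the Ensembl REST service at rest.ensembl.org, which is a later addition and is not mentioned in these slides. The helper names `lookup_url` and `lookup` are made up for this example, and `lookup` needs network access.

```python
import json
import urllib.request

# Assumption: the public Ensembl REST server, not the Perl API named on the slide.
REST_SERVER = "https://rest.ensembl.org"


def lookup_url(stable_id):
    """Build the URL of the REST 'lookup/id' endpoint for one stable ID."""
    return f"{REST_SERVER}/lookup/id/{stable_id}?content-type=application/json"


def lookup(stable_id):
    """Fetch the record for a stable ID as a dict (requires network access)."""
    with urllib.request.urlopen(lookup_url(stable_id)) as response:
        return json.load(response)


# ENSG00000139618 is the human BRCA2 gene, used here only as an example ID.
print(lookup_url("ENSG00000139618"))
```

Calling `lookup("ENSG00000139618")` would return the gene record as a dictionary; the exact fields depend on the current release, so none are listed here.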
25 of 42 Which Data Are Available?
- Genomic sequence
- Transcript and peptide models
- External references
- Variation data: SNPs
- Mapped cDNAs, peptides, microarray probes, BAC clones etc.
- Other features of the genome: cytogenetic bands, markers, repeats etc.
- Comparative data: orthologues and paralogues, protein families, whole genome alignments, syntenic regions
- Regulatory data: “best guess” set of regulatory elements
- Data from external sources (DAS)
26 of 42 Genomic sequence: Gene location
27 of 42 Genomic sequence: Export
28 of 42 Transcript and peptide info (click to view)
29 of 42 External references (click to view)
30 of 42 Single nucleotide polymorphisms (SNPs)
- Two human genomes differ by ~0.1%
- Polymorphism: a DNA variation in which each possible sequence is present in at least 1% of people
- Most polymorphisms (~90%) take the form of SNPs: variations that involve just one nucleotide
- ~1 out of every 300 bases in the human genome
- ~10 million in the human genome
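The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes a genome size of about 3 Gb (the value given for the human genome earlier in the deck); the function name `is_polymorphism` is made up for this example and simply encodes the 1% definition.

```python
GENOME_SIZE_BP = 3_000_000_000  # ~3 Gb, approximate human genome size
BASES_PER_SNP = 300             # ~1 SNP per 300 bases, as stated on the slide

# Expected number of SNP positions in the genome.
expected_snps = GENOME_SIZE_BP // BASES_PER_SNP
print(expected_snps)  # 10000000, i.e. ~10 million

# Two genomes differing by ~0.1% differ at roughly this many positions.
pairwise_differences = GENOME_SIZE_BP // 1000
print(pairwise_differences)  # 3000000


def is_polymorphism(allele_frequencies, threshold=0.01):
    """True if there are at least two alleles and each reaches the threshold."""
    return len(allele_frequencies) >= 2 and all(
        f >= threshold for f in allele_frequencies
    )


print(is_polymorphism([0.97, 0.03]))    # True
print(is_polymorphism([0.995, 0.005]))  # False: the rare allele is below 1%
```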
31 of 42 Practical Applications
- Disease diagnosis
- Association studies
- Forensic testing
- Population genetics and evolutionary studies
- Marker-assisted selection
32 of 42 SNPs in Ensembl - Types
Type                   Description
Non-synonymous         In coding sequence, resulting in an amino acid change
Synonymous             In coding sequence, not resulting in an amino acid change
Frameshift             In coding sequence, resulting in a frameshift
Stop lost              In coding sequence, resulting in the loss of a stop codon
Stop gained            In coding sequence, resulting in the gain of a stop codon
Essential splice site  In the first 2 or the last 2 base pairs of an intron
Splice site            1-3 bp into an exon or 3-8 bp into an intron
Upstream               Within 5 kb upstream of the 5'-end of a transcript
Regulatory region      In a regulatory region annotated by Ensembl
5' UTR                 In the 5' UTR
Intronic               In an intron
3' UTR                 In the 3' UTR
Downstream             Within 5 kb downstream of the 3'-end of a transcript
Intergenic             More than 5 kb away from a transcript
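The location-based types in this table can be sketched as a simple position check. This is a simplified illustration under stated assumptions, not Ensembl's implementation: it handles one forward-strand transcript, uses 1-based inclusive coordinates, treats the 5 kb boundary as inclusive, and lumps every exon position into a placeholder "exonic" label. The coding consequences (synonymous, frameshift and so on) and the wider "splice site" window are left out because they need the sequence and reading frame. The function name `classify_position` is made up for this example.

```python
FLANK = 5_000  # the 5 kb upstream/downstream window from the table


def classify_position(pos, tx_start, tx_end, exons):
    """Classify a position relative to one forward-strand transcript.

    exons is a sorted list of (start, end) pairs, 1-based and inclusive.
    """
    if pos < tx_start:
        return "upstream" if tx_start - pos <= FLANK else "intergenic"
    if pos > tx_end:
        return "downstream" if pos - tx_end <= FLANK else "intergenic"
    for start, end in exons:
        if start <= pos <= end:
            return "exonic"  # placeholder: UTR vs coding is not resolved here
    # Inside the transcript but outside every exon: the position is in an intron.
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        if left_end < pos < right_start:
            # First 2 or last 2 base pairs of the intron.
            if pos - left_end <= 2 or right_start - pos <= 2:
                return "essential splice site"
            return "intronic"
    return "intronic"


exons = [(10_000, 10_100), (10_200, 10_300)]  # hypothetical two-exon transcript
for pos in (4_999, 5_000, 10_050, 10_101, 10_150, 15_300):
    print(pos, classify_position(pos, 10_000, 10_300, exons))
```

On a reverse-strand transcript the upstream and downstream labels would swap, which this sketch does not handle.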
33 of 42 SNPs in Ensembl
ContigView: SNPs in genomic context
34 of 42 SNPs in Ensembl
35 of 42 Biological Evidence
All Ensembl gene predictions are based on experimental evidence:
- UniProt/Swiss-Prot: a manually curated database and therefore of highest accuracy
- NCBI RefSeq: a partially manually curated database
- UniProt/TrEMBL: automatically annotated translations of EMBL coding sequence (CDS) features
- EMBL / GenBank / DDBJ: primary nucleotide sequence repositories
36 of 42 The Ensembl Genebuild
Genome assembly + Experimental evidence + Computer programs → Ensembl Genes
37 of 42 Ensembl Identifiers
ENSG###  Ensembl Gene ID
ENST###  Ensembl Transcript ID
ENSP###  Ensembl Peptide ID
ENSE###  Ensembl Exon ID
ENSF###  Ensembl Family ID
ENSR###  Ensembl Regulatory Feature ID
For species other than human, a species code is added after ENS: MUS for mouse (Mus musculus): ENSMUSG###, DAR for zebrafish (Danio rerio): ENSDARG###, etc.
For imported genes Ensembl uses the original identifiers.
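The identifier scheme above can be sketched as a small parser. This is an illustration under assumptions: the pattern only covers the six feature letters listed on the slide, the species code is assumed to be 3 or 4 upper-case letters, and the function name `parse_ensembl_id` is made up for this example. The mouse ID below uses arbitrary digits purely to show the format.

```python
import re

FEATURE_TYPES = {
    "G": "gene",
    "T": "transcript",
    "P": "peptide",
    "E": "exon",
    "F": "family",
    "R": "regulatory feature",
}

# ENS + optional species code + one feature letter + digits.
# Assumption: species codes are 3-4 upper-case letters (absent for human).
ID_PATTERN = re.compile(r"^ENS([A-Z]{3,4})?([GTPEFR])(\d+)$")


def parse_ensembl_id(stable_id):
    """Split an Ensembl-style ID into its parts, or return None if it does not fit."""
    match = ID_PATTERN.match(stable_id)
    if match is None:
        return None
    species_code, feature_letter, number = match.groups()
    return {
        "species_code": species_code,  # None means human
        "feature": FEATURE_TYPES[feature_letter],
        "number": number,
    }


print(parse_ensembl_id("ENSG00000139618"))     # human gene ID
print(parse_ensembl_id("ENSMUSG00000017167"))  # mouse gene ID format
print(parse_ensembl_id("BRCA2"))               # None: not an Ensembl-style ID
```

Because imported genes keep their original identifiers, a `None` result does not mean the ID is absent from Ensembl, only that it does not follow this scheme.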
38 of 42 Pre! and Archive! Sites
39 of 42 Powered by Ensembl
40 of 42 Ensembl - Open Source
- Data and software freely available
- More than 50 installs worldwide
- Academia and industry
- Local or available via the web
- Mirrors with Ensembl data, e.g. http://ensembl.genome.tugraz.at/index.html, or user projects with own data
41 of 42 Ensembl Accounts
- Personalise Ensembl by saving bookmarks, view configurations and homepage preferences in a user account
- Share bookmarks and configurations by setting up groups
Please note that all Ensembl data remain freely accessible. It is not necessary to register in order to gain access to Ensembl data!
42 of 42 Website Statistics
On average 1,000,000 page impressions / week
Top 3 species:
Top 3 countries:
43 of 42 What If I Need Help?
- Helpdesk: helpdesk@ensembl.org
- Mailing lists: ensembl-dev@ebi.ac.uk, ensembl-announce@ebi.ac.uk
- Animated tutorials: http://www.ensembl.org/common/Workshops_Online
44 of 42 Today
1. Ensembl: www.ensembl.org
   1. Worked example: a walk through the main pages of the Ensembl browser, using the EPO (Erythropoietin precursor) gene as an example (Course Homepage).
   2. Ensembl exercise: answering questions by using Ensembl (Course Homepage).
   3. If time permits, find information about your favorite gene by using Ensembl.