1
Introduction to Genes and Genomes with Ensembl
2
Large amounts of raw DNA sequence data
CGGCCTTTGGGCTCCGCCTTCAGCTCAAGACTTAACTTCCCTCCCAGCTGTCCCAGATGACGCCATCTGAAATTTCTTGGAAAC ACGATCACTTTAACGGAATATTGCTGTTTTGGGGAAGTGTTTTACAGCTGCTGGGCACGCTGTATTTGCCTTACTTAAGCCCCT GGTAATTGCTGTATTCCGAAGACATGCTGATGGGAATTACCAGGCGGCGTTGGTCTCTAACTGGAGCCCTCTGTCCCCACTAGC CACGCGTCACTGGTTAGCGTGATTGAAACTAAATCGTATGAAAATCCTCTTCTCTAGTCGCACTAGCCACGTTTCGAGTGCTTA ATGTGGCTAGTGGCACCGGTTTGGACAGCACAGCTGTAAAATGTTCCCATCCTCACAGTAAGCTGTTACCGTTCCAGGAGATGG GACTGAATTAGAATTCAAACAAATTTTCCAGCGCTTCTGAGTTTTACCTCAGTCACATAATAAGGAATGCATCCCTGTGTAAGT GCATTTTGGTCTTCTGTTTTGCAGACTTATTTACCAAGCATTGGAGGAATATCGTAGGTAAAAATGCCTATTGGATCCAAAGAG AGGCCAACATTTTTTGAAATTTTTAAGACACGCTGCAACAAAGCAGGTATTGACAAATTTTATATAACTTTATAAATTACACCG AGAAAGTGTTTTCTAAAAAATGCTTGCTAAAAACCCAGTACGTCACAGTGTTGCTTAGAACCATAAACTGTTCCTTATGTGTGT ATAAATCCAGTTAACAACATAATCATCGTTTGCAGGTTAACCACATGATAAATATAGAACGTCTAGTGGATAAAGAGGAAACTG GCCCCTTGACTAGCAGTAGGAACAATTACTAACAAATCAGAAGCATTAATGTTACTTTATGGCAGAAGTTGTCCAACTTTTTGG TTTCAGTACTCCTTATACTCT AACTAAGAATTTAAGGCTGGG CCAGAAGTTTGAGACCAGCCT GTGCCTGTAATCCCAGCTACA ATGCCACTGCACTCTAGCCTG TAAAAATGATCTAGGACCCCCGGAGTGCTTTTGTTTATGTAGCT CGTGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAG GGCCAACATGGTGAAACCCTATCTCTACTAAAAATACAAAAAAT CGGGAGGTGGAGGCAGGAGAATCGCTTGAACCCTGGAGGCAGAG GGCCACATAGCATGACTCTGTCTCAAAACAAACAAACAAACAAA Large amounts of raw DNA sequence data TACCATATTAGAAATTTAA GTGGGCGGATCACTTGAGG GTGCTGCGTGTGGTGGTGC GTTGCAGTGAGCCAAGATC AAACTAAGAATTTAAAGTT AATTTACTTAAAAATAATGAAAGCTAACCCATTGCATATTATCACAACATTCTTAGGAAAAATAACTTTTTGAAAACAAGTGAG TGGAATAGTTTTTACATTTTTGCAGTTCTCTTTAATGTCTGGCTAAATAGAGATAGCTGGATTCACTTATCTGTGTCTAATCTG TTATTTTGGTAGAAGTATGTGAAAAAAAATTAACCTCACGTTGAAAAAAGGAATATTTTAATAGTTTTCAGTTACTTTTTGGTA TTTTTCCTTGTACTTTGCATAGATTTTTCAAAGATCTAATAGATATACCATAGGTCTTTCCCATGTCGCAACATCATGCAGTGA TTATTTGGAAGATAGTGGTGTTCTGAATTATACAAAGTTTCCAAATATTGATAAATTGCATTAAACTATTTTAAAAATCTCATT CATTAATACCACCATGGATGTCAGAAAAGTCTTTTAAGATTGGGTAGAAATGAGCCACTGGAAATTCTAATTTTCATTTGAAAG 
TTCACATTTTGTCATTGACAACAAACTGTTTTCCTTGCAGCAACAAGATCACTTCATTGATTTGTGAGAAAATGTCTACCAAAT TATTTAAGTTGAAATAACTTTGTCAGCTGTTCTTTCAAGTAAAAATGACTTTTCATTGAAAAAATTGCTTGTTCAGATCACAGC TCAACATGAGTGCTTTTCTAGGCAGTATTGTACTTCAGTATGCAGAAGTGCTTTATGTATGCTTCCTATTTTGTCAGAGATTAT TAAAAGAAGTGCTAAAGCATTGAGCTTCGAAATTAATTTTTACTGCTTCATTAGGACATTCTTACATTAAACTGGCATTATTAT TACTATTATTTTTAACAAGGACACTCAGTGGTAAGGAATATAATGGCTACTAGTATTAGTTTGGTGCCACTGCCATAACTCATG CAAATGTGCCAGCAGTTTTACCCAGCATCATCTTTGCACTGTTGATACAAATGTCAACATCATGAAAAAGGGTTGAAAAAAGGA ATATTTTAATAGTTTTCAGTTACTTTTTGGTATTTTTCCTTGTACTTTGCATAGATTTTTCAAAGATCTAATAGATATACCCGA
3
Making Sense out of Sequence …
4
The Ensembl genome browser: making it interesting
• Genes
• Variation
• Regulatory elements
The ENCODE (ENCyclopedia Of DNA Elements) project, Science 306: (2004)
5
Vertebrate species on Ensembl
Mostly vertebrates
6
Non‐vertebrates on Ensembl Genomes
• Fungi
• Bacteria
• Protists
• Metazoa
• Plants
7
Ensembl and EnsemblGenomes
8
Ensembl gene models
• Automatic annotation
• Manual annotation
9
Automatic gene annotation
Genome-wide determination using the Ensembl automated pipeline
• Predictions based on the genomic sequence (ab initio)
• Predictions based on experimental (biological) data
  – ESTs
  – RNAseq data
  – cDNA and protein alignments (from sequence DBs)
10
Biological Evidence
International Nucleotide Sequence databases
Protein sequence databases
• Swiss-Prot: manually curated
• TrEMBL: unreviewed translations
NCBI RefSeq
• Manually annotated proteins and mRNAs (NP, NM)
11
Manual gene annotation
Gene determination on a case-by-case basis by a curator
• Genome-wide genes list
12
Ensembl automatic annotation
13
Automatic annotation
• Many species (>60)
• Genome-wide at once
Manual annotation
• Few species (human, mouse, zebrafish: Hs, Mm, Dr)
• Gene-by-gene
14
Golden transcripts
• Identical automatic and manual annotation
• Higher confidence and quality
[Figure: transcript diagram labeled 5’ UTR, exon, intron, 3’ UTR]
Exons are drawn as boxes: filled boxes are coding and unfilled boxes are untranslated. Introns are drawn as lines.
15
CCDS transcripts
Consensus coding DNA sequence set
• Agreement between EBI, WTSI, UCSC and NCBI
• CCDS transcript
16
Higher quality transcripts
• CCDS transcripts (protein-coding only)
• Ensembl/Havana merged transcripts
Both are available for only a limited number of species.
17
Ensembl stable IDs
ENSG########### Ensembl Gene ID
ENST########### Ensembl Transcript ID
ENSP########### Ensembl Peptide ID
ENSE########### Ensembl Exon ID
For non‐human species a species code is inserted after ENS:
• MUS (Mus musculus) for mouse: ENSMUSG###
• DAR (Danio rerio) for zebrafish: ENSDARG###
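The stable ID scheme above can be checked mechanically. The following is a minimal sketch, not an Ensembl tool: it assumes the species code is exactly three uppercase letters (as in MUS and DAR) and that the numeric part is exactly 11 digits, as shown on the slide. The names `parse_stable_id` and `StableId` are hypothetical, and version suffixes are not handled.

```python
import re
from typing import NamedTuple, Optional

# Feature letters taken from the slide: Gene, Transcript, Peptide, Exon.
FEATURE_TYPES = {"G": "gene", "T": "transcript", "P": "peptide", "E": "exon"}

# Assumption: optional three-letter species code, then one feature letter,
# then exactly 11 digits. Real IDs may deviate from this simplified pattern.
STABLE_ID_RE = re.compile(r"^ENS([A-Z]{3})?([GTPE])(\d{11})$")


class StableId(NamedTuple):
    species_code: Optional[str]  # None means no species code, i.e. human
    feature: str                 # "gene", "transcript", "peptide" or "exon"
    number: int                  # numeric part with leading zeros dropped


def parse_stable_id(stable_id: str) -> StableId:
    """Split an Ensembl stable ID into species code, feature type and number."""
    match = STABLE_ID_RE.match(stable_id)
    if match is None:
        raise ValueError(f"not a recognised Ensembl stable ID: {stable_id!r}")
    species_code, letter, digits = match.groups()
    return StableId(species_code, FEATURE_TYPES[letter], int(digits))


if __name__ == "__main__":
    for example in ("ENSG00000139618", "ENSMUSG00000017167"):
        print(example, parse_stable_id(example))
```

An ID without a species code is reported with `species_code=None`, which corresponds to human in the scheme on the slide.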
18
NCBI
http://www.youtube.com/ncbinlm
Go to www.youtube.com
Search “NCBI tutorial general”
19
The National Center for Biotechnology Information
Bethesda, MD
Created in 1988 as a part of the National Library of Medicine at NIH
• Establish public databases
• Research in computational biology
• Develop software tools for sequence analysis
• Disseminate biomedical information
20
Three international nucleotide sequence databases
21
Selected NCBI Databases
Biomedical literature
• PubMed: free Medline
• PubMed Central: full text online access
• NCBI Bookshelf: online biomedical textbooks
Biomolecular databases
• Nucleotide: GenBank (submitted sequence records), RefSeq (curated NCBI reference sequences)
• Protein: GenBank and RefSeq translations, outside protein
• dbSNP: small scale genetic variations
• Structure: biomolecular 3-D structures
• MMDB: NCBI’s 3D structure database
• GEO: microarray expression data
• SRA: next-generation sequence data
22
GenBank & RefSeq
23
RefSeq: NCBI’s Derivative Sequence Database
• Experimentally verified / curated transcripts and proteins: NM_, NP_ accession numbers
• Model transcripts and proteins: XM_, XP_ accession numbers
• Assembled genomic regions (contigs): NT_, NW_ accession numbers
• Chromosome records: NC_, AC_ accession numbers
• RefSeqGene records: NG_ accession numbers (NG_ is also used for pseudogenes and other fixed genomic sequences)
• Draft whole genome shotgun assemblies (microbial): NZ_ accession numbers
• Microbial proteins: NP_, YP_, ZP_ accession numbers
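As an illustration of how the accession prefixes listed above can be used, the sketch below looks up the category of a RefSeq accession from its first three characters. The table is transcribed from the slide only and is not NCBI's complete prefix list. The function name `classify_refseq` is hypothetical. NP_ appears twice on the slide, so it is given a single combined description here.

```python
# Prefix-to-category table transcribed from the slide (not NCBI's full list).
REFSEQ_PREFIXES = {
    "NM_": "curated transcript",
    "NP_": "curated protein (also used for microbial proteins)",
    "XM_": "model transcript",
    "XP_": "model protein",
    "NT_": "assembled genomic region (contig)",
    "NW_": "assembled genomic region (contig)",
    "NC_": "chromosome record",
    "AC_": "chromosome record",
    "NG_": "RefSeqGene record or other fixed genomic sequence",
    "NZ_": "draft whole genome shotgun assembly (microbial)",
    "YP_": "microbial protein",
    "ZP_": "microbial protein",
}


def classify_refseq(accession: str) -> str:
    """Return the slide's category for a RefSeq accession such as 'NM_000546'."""
    prefix = accession[:3].upper()
    if prefix not in REFSEQ_PREFIXES:
        raise ValueError(f"no RefSeq prefix from the slide matches {accession!r}")
    return REFSEQ_PREFIXES[prefix]


if __name__ == "__main__":
    for example in ("NM_000546", "XP_000001", "NC_000001"):
        print(example, "->", classify_refseq(example))
```

Accessions without one of these prefixes, such as plain GenBank records, raise a `ValueError` instead of being guessed at.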
24
UCSC Genome Browser
25
GeneCards