Sequence Analysis with Artemis & Artemis Comparison Tool (ACT). South East Asian Training Course on Bioinformatics Applied to Tropical Diseases - 2005


Sequence Analysis with Artemis & Artemis Comparison Tool (ACT) South East Asian Training Course on Bioinformatics Applied to Tropical Diseases (Sponsored by UNDP/World Bank/WHO/TDR) International Centre For Genetic Engineering And Biotechnology, New Delhi, INDIA

Workshop Overview: Overview of genome sequencing and sequence analysis. Demonstration of Artemis. Hands-on guided exercise in Artemis. Demonstration of ACT. Hands-on guided exercise in ACT. Generating ACT comparison files.

The Wellcome Trust Sanger Institute. Funded by the Wellcome Trust, a registered charity. Established in 1993 to begin the Human Genome Project: first draft (2000), complete (2003-4). Data release policy: all sequence data is released immediately and is freely available via the internet in order to maximise its benefit for research. ftp://ftp.sanger.ac.uk/ (Photos: Wellcome Trust Photo Library)

Generating the complete genome sequence

Infrastructure

Levels of automation: colony-picking robots; plasmid prep robots; ABI3700 and ABI3730 sequencers. TOTAL: 140.

Automated sequencing. Each ABI reads 96 DNA sequences at once. The machines are run 10 times a day, 7 days a week. Throughput of 1,200 to 1,[digits lost] well plates per day, or about 120,000 DNA samples read each day. Each day, the Sanger Institute reads 60 million base pairs. That is equal to one of the smaller human chromosomes and many times the size of an average bacterial genome.
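The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a consistency check: the constants are taken from the slide, while the function names and the assumption that one plate holds 96 samples are illustrative and not stated in the original.

```python
# Figures quoted on the slide.
READS_PER_RUN = 96           # each ABI reads 96 DNA sequences at once
RUNS_PER_DAY = 10            # machines are run 10 times a day
SAMPLES_PER_DAY = 120_000    # DNA samples read each day
BASES_PER_DAY = 60_000_000   # base pairs read each day


def machines_needed(samples_per_day, reads_per_run, runs_per_day):
    """Number of sequencers implied by the daily sample count."""
    return samples_per_day / (reads_per_run * runs_per_day)


def mean_read_length(bases_per_day, samples_per_day):
    """Average number of bases obtained per sample (read)."""
    return bases_per_day / samples_per_day


def plates_per_day(samples_per_day, wells_per_plate=96):
    """Plates per day, ASSUMING 96 samples per plate (not stated on the slide)."""
    return samples_per_day / wells_per_plate


if __name__ == "__main__":
    print(machines_needed(SAMPLES_PER_DAY, READS_PER_RUN, RUNS_PER_DAY))  # sequencers
    print(mean_read_length(BASES_PER_DAY, SAMPLES_PER_DAY))               # bases per read
    print(plates_per_day(SAMPLES_PER_DAY))                                # plates per day
```

Under these assumptions the quoted numbers imply about 125 machines running in parallel, an average read of 500 bases, and 1,250 plates per day, which is above the lower bound of 1,200 plates given on the slide.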

Pathogen Sequencing Unit. Bacteria: M. tuberculosis, M. leprae, Y. pestis, S. typhi, C. diphtheriae, Bordetella spp. x3, B. pseudomallei, S. aureus MRSA, S. aureus MSSA, E. carotovora. Yeasts and fungi: Saccharomyces cerevisiae, Schizosaccharomyces pombe, Aspergillus fumigatus, Candida dubliniensis, Candida parapsilosis. Protozoa: Plasmodium falciparum x3, Plasmodium spp. x5, Leishmania spp., Trypanosoma spp., Eimeria, Theileria, Babesia. The Pathogen Group is funded by the Beowulf Genomics Initiative to sequence the genomes of a wide range of small eukaryotes and microbes.

Sequencing strategy and assembly

Shotgun sequencing: strategy. ‘Draft sequence’: 95% coverage, 4-5x depth. Order of contigs? [Figure labels: DNA, contiguous sequence, pUC clone end sequence, physical gap, sequence gap.]

‘A genome in a day’ ‘15 in a month’ ‘High-quality draft sequence’

Shotgun sequencing: strategy. Finished sequence: 100% coverage, 10x depth. [Figure labels: DNA, contiguous sequence, pUC clone end sequence, large clone end sequence, physical gap, sequence gap.]
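The relationship between sequencing depth and coverage can be illustrated with the idealised Lander-Waterman model, in which reads land uniformly at random and the expected fraction of the genome covered at depth c is 1 - e^(-c). The slides do not mention this model; it is brought in here only as a rough sketch, and the function names are hypothetical.

```python
import math


def expected_coverage(depth):
    """Expected fraction of the genome covered by at least one read,
    under the idealised Lander-Waterman assumption of uniformly random reads."""
    return 1.0 - math.exp(-depth)


def expected_uncovered_bases(genome_length, depth):
    """Expected number of bases not covered by any read under the same model."""
    return genome_length * math.exp(-depth)


if __name__ == "__main__":
    # Depths quoted on the slides: 4-5x for draft, 10x for finished sequence.
    for depth in (4, 5, 10):
        print(f"{depth}x depth: {expected_coverage(depth):.4%} expected coverage")
```

The idealised model predicts roughly 98% to 99% coverage at 4-5x depth, somewhat more than the 95% quoted for a draft sequence. Real shotgun libraries are not perfectly random (cloning bias and repeats leave extra gaps), which is also why 10x depth alone does not give a finished sequence: the remaining physical and sequence gaps have to be closed in a directed way.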

Repeats!!!

Shotgun assembly - Yersinia pestis

Primary DNA sequence. Analyses: Dotter, BlastN, BlastX, gene finders, tRNA scan. Features: repeats, pseudo-genes, rRNA, genes, tRNA. Manual curation.

Primary DNA sequence. Analyses: Dotter, BlastN, BlastX, gene finders, tRNA scan. Features: repeats, pseudo-genes, rRNA, genes, tRNA. Manual curation. Protein analyses: Fasta, BlastP, Pfam, Prosite, Psort, SignalP, TMHMM. Manual curation. Annotated sequence.

PSU Projects Organism Annotated genome Finished genome Database entry Artemis

Sequence viewer and analysis tool. Visualization of sequence features: DNA, six-frame translation. Perform and view analysis: basic analysis; launch more complex analyses and searches; import and view the results of other searches.
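To make the idea of a six-frame translation concrete, here is a minimal sketch in Python. It assumes the standard genetic code and plain A/C/G/T input; it is not Artemis code, and all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate the three forward and three reverse-strand reading frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Stop codons are shown as `*`, so an open reading frame appears as a stretch without `*` in one of the six frames, which is what a six-frame display lets an annotator see at a glance.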

Outline of Artemis demonstration: Artemis window features; open a genome sequence; changing the view; getting around (Goto menu, Navigator, Feature Selector); basic analysis (edit a feature, Fasta search, show feature plots).

Artemis window: sliders, drop-down menus, entry button line, main sequence view panel, magnified sequence view panel, feature menu.

Artemis

Curating gene models in Artemis Use of multiple lines of evidence

Curating gene models in Artemis Use of FASTA evidence

EST sequencing & mapping. [Figure labels: gene with 5’UTR, start (M), exons, introns, stop, 3’UTR; mRNA with CAP and poly(A) tail (AAAAAAAAAA); poly(T) primer (TTTTTTTTT); cDNA; EST.]

Curating gene models in Artemis Use of EST evidence [Figure label: ESTs]

Curation of gene models in Artemis Mapping proteome fragments to genome

Curation and annotation in Artemis Mapping InterPro domain hits to genome

Annotation of pathogen genomes at the PSU (using ARTEMIS). Finished sequence. Gene finders: PHAT, Glimmer, Orpheus. Similarity evidence: FASTA, BLAST, EST. Primary gene model. InterPro scan: HMMPfam, HMMSMART, PRINTS, PROSITE, ProDom, TIGRFAMs. Other predictions: SignalP, TMHMM, tRNA scan. Manual curation. Refined gene model. Organism-specific gene families; functional classification (GO / Riley); comparative genomics (using ACT). Complete annotation.

Gene model annotation Gene function

Top tips! Manual annotation. Use several lines of evidence:
- Run several available gene finding programs
- Search programs: local (BLAST) and global (FASTA) alignments
- Protein domains and motifs: InterPro (Pfam, PROSITE, SMART etc.)
- Transmembrane / signal peptide prediction (TMHMM, SignalP)
- Base your annotation on characterised proteins where possible (e.g. UniProt entry)
- Read the literature (PubMed entry)
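The tips above distinguish local from global alignments. The difference can be sketched with textbook dynamic programming: Needleman-Wunsch scoring for the global case and Smith-Waterman scoring for the local case. This is not how BLAST or FASTA work internally (both use heuristics), and the scoring values and function name below are illustrative assumptions.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best alignment score of sequences a and b.

    local=False: global (Needleman-Wunsch style), every residue must be aligned.
    local=True:  local (Smith-Waterman style), best-scoring pair of substrings.
    Scoring values are arbitrary illustrative defaults with a linear gap penalty.
    """
    # First row: aligning an empty prefix of a against prefixes of b.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                # A local alignment may restart anywhere, so scores never drop below 0.
                score = max(score, 0)
                best = max(best, score)
            curr.append(score)
        prev = curr
    return best if local else prev[-1]


if __name__ == "__main__":
    domain = "ACGT"
    longer = "GGGGACGTGGGG"
    print("global:", alignment_score(domain, longer))
    print("local: ", alignment_score(domain, longer, local=True))
```

With these settings a short sequence embedded in a longer one scores well locally but poorly globally, because the global alignment is penalised for every unaligned flanking residue. This is one reason to look at both kinds of evidence when curating a gene model.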

Sanger Front page