LSM3241: Bioinformatics and Biocomputing Lecture 2: Bioinformatics of viral genome Prof. Chen Yu Zong Tel: 6874-6877

Presentation transcript:

LSM3241: Bioinformatics and Biocomputing Lecture 2: Bioinformatics of viral genome Prof. Chen Yu Zong Tel: Room 07-24, level 7, SOC1, National University of Singapore

2 Resource of Viral Genomes NCBI Genome Database

3 Resource of Viral Genomes NCBI Genome Database: 2,226 entries of viral genomes (1,524 distinct virus strains) are in the database; the early 2005 figures were 1,250 entries and 1,022 distinct strains. 1,193 entries are complete viral genomes; the early 2005 figure was 900.

4 Resource of Viral Genomes NCBI Genome Database entries of coronavirus genomes (8 in early 2005) 16 entries of influenza H5N1 genomes

5 Resource of Viral Genomes NCBI Genome Database: Information on the viral genomes in the database can also be retrieved by clicking the Viruses link.

6 Resource of Viral Genomes NCBI Genome Database List of viral genomes: (1,927 entries in Jan 2006, 1,461 in Jan 2005)

7 Resource of Viral Genomes NCBI Genome Database Viral taxonomy groups:

8 Resource of Viral Genomes NCBI Genome Database Viral genome list:

9 Resource of Viral Genomes Viral genome list:

10 Bioinformatics of Viral Genomes Viral name link: Viral genome link All entries

11 Bioinformatics of Viral Genomes Viral protein link: Limit to title search

12 Bioinformatics of Viral Genomes SARS coronavirus PP1ab PID link: it gives multiple entries from different strains or from related species. Viral strain

13 Different strains of SARS coronavirus

14 Bioinformatics of Viral Genomes Note: A viral polyprotein is not a single protein; it is a combination of several proteins, and the information about these proteins can be difficult to read. Suggestion: Look at the latest NCBI entry for the same virus from a reputable research group.

15 Bioinformatics of Viral Genomes SARS coronavirus unknown sars3a PID link:

16 Bioinformatics of Viral Genomes Alternative way to find SARS coronavirus genome. Look for the latest entry with complete genome and good functional annotation. Not all entries have these.

17 Bioinformatics of Viral Genomes The latest good entry: AY civet020 SARS coronavirus (In Jan 2005 AY SARS coronavirus FRA), complete genome

18 SARS Coronavirus Genome You are expected to find the info about each gene (genome location, sequence, function)
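Once a gene's genome location is known, its sequence can be cut out of the genome. The sketch below assumes GenBank-style coordinates (1-based, both ends inclusive) and a simple start..end location; the function name and the toy sequence are made up for illustration, and joined or partial locations are not handled.

```python
def feature_sequence(genome: str, start: int, end: int, complement: bool = False) -> str:
    """Return the bases for a simple location start..end (1-based, inclusive)."""
    if not (1 <= start <= end <= len(genome)):
        raise ValueError("location lies outside the genome")
    # Convert the 1-based inclusive range to a Python slice.
    segment = genome[start - 1:end]
    if complement:
        # Reverse complement for features annotated on the opposite strand.
        pairs = str.maketrans("ACGTacgt", "TGCAtgca")
        segment = segment.translate(pairs)[::-1]
    return segment


toy_genome = "ATGGCCTTAGGA"  # made-up sequence, for illustration only
print(feature_sequence(toy_genome, 4, 9))
print(feature_sequence(toy_genome, 4, 9, complement=True))
```

For a real entry, the genome string would come from the ORIGIN section of the record and the coordinates from its feature table.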

19 Function of SARS Coronavirus Genes

20 Bioinformatics of Viral Genomes Where to find the proteins in the genome entry? Source 1: mat_peptide Protein name

21 Bioinformatics of Viral Genomes Where to find the proteins in the genome entry? Source 1: mat_peptide

22 Bioinformatics of Viral Genomes Where to find the proteins in the genome entry? Putative 3C-like protease mat_peptide link: Protein name Protein function

23 Bioinformatics of Viral Genomes Where to find the proteins in the genome entry? Source 2: CDS Protein name

24 Bioinformatics of Viral Genomes Where to find the proteins in the genome entry? Source 2: CDS

25 Bioinformatics of Viral Genomes Where to find the proteins in the genome entry? Source 2: CDS

26 Bioinformatics of Viral Genomes Where to find the proteins in the genome entry? Source 2: CDS

27 Bioinformatics of Viral Genomes Where to find the proteins in the genome entry? Nucleocapsid protein protein_id link: Protein name
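The two sources above (mat_peptide and CDS features) can also be pulled out of the text of a genome entry with a small script. This is a minimal sketch, not a full GenBank parser: it assumes each qualifier fits on one line, and the sample table uses the protein names from the slides with made-up coordinates and a placeholder protein_id.

```python
# Illustrative feature table: coordinates and the protein_id are placeholders.
SAMPLE_FEATURE_TABLE = """\
     mat_peptide     10..90
                     /product="putative 3C-like protease"
     CDS             100..400
                     /product="nucleocapsid protein"
                     /protein_id="PLACEHOLDER.1"
     gene            100..400
                     /gene="N"
"""


def parse_features(table, wanted=("mat_peptide", "CDS")):
    """Collect location and one-line qualifiers for the wanted feature keys."""
    features = []
    current = None  # feature whose qualifiers are being read, if it is wanted
    for line in table.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        if stripped.startswith("/"):
            # Qualifier line such as /product="..."
            if current is not None and "=" in stripped:
                name, value = stripped[1:].split("=", 1)
                current[name] = value.strip('"')
        elif indent < 21:
            # A new feature starts: key followed by its location.
            parts = stripped.split()
            current = None
            if len(parts) == 2 and parts[0] in wanted:
                current = {"key": parts[0], "location": parts[1]}
                features.append(current)
    return features


for feature in parse_features(SAMPLE_FEATURE_TABLE):
    print(feature["key"], feature["location"], feature.get("product"))
```

The product qualifier gives the protein name, and for CDS features the protein_id qualifier is the link to the protein entry.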

28 Bioinformatics of Viral Genomes How to find the name or function of a putative protein in a genome? Medline keyword search; Google search.

29 Bioinformatics of Viral Genomes What if the function of a putative protein is unknown? Sequence alignment (BLAST, PSI-BLAST); this will be further discussed in lecture 4. Motif analysis (conduct a PROSITE motif search). If sequence analysis fails or is in doubt, try a machine learning method (SVMProt, Nucleic Acids Res., 31: ; ProtFun, Bioinformatics, 19: ); this will be studied in lecture 5.
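A motif search of the kind mentioned above can be tried locally by turning a PROSITE pattern into a regular expression. The sketch below is an illustration under simplifying assumptions: it covers the common pattern elements (x, [..], {..}, repeat counts, and the < and > anchors outside brackets) and reports non-overlapping matches only. The function names and the test sequence are made up; the example pattern is the PROSITE N-glycosylation site pattern N-{P}-[ST]-{P}.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern into Python regular expression syntax."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # {P} means "any residue except P".
        element = element.replace("{", "[^").replace("}", "]")
        # x(2,4) means "two to four of any residue".
        element = element.replace("(", "{").replace(")", "}")
        # x is any residue; < and > anchor the sequence ends.
        element = element.replace("x", ".").replace("<", "^").replace(">", "$")
        parts.append(element)
    return "".join(parts)


def find_motif(sequence: str, pattern: str):
    """Return (1-based position, matched residues) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


toy_protein = "MKNGSAANPTQ"  # made-up sequence, for illustration only
print(prosite_to_regex("N-{P}-[ST]-{P}."))
print(find_motif(toy_protein, "N-{P}-[ST]-{P}."))
```

A hit from a short pattern like this is only a hint about function; short motifs match many sequences by chance.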

30 Bioinformatics of Viral Genomes Drug design: Step 1: Find the right target in the genome: a key protein involved in the viral cycle (to stop the disease process) that is different from human proteins (to reduce side effects). Step 2: Find or make a chemical agent to stop the protein; in the majority of cases these are protein inhibitors. Step 3: Testing and clinical trials.
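One rough way to judge whether a candidate target is "different from human proteins" is the percent identity between the viral protein and its closest human match. The sketch below assumes the two sequences have already been aligned (for example by BLAST) and are given as equal-length strings with "-" for gaps. The function name is hypothetical, and the denominator used here (aligned positions without gaps) is only one of several conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned positions where the two residues agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned fragments, for illustration only.
print(percent_identity("ACDE", "ACDF"))
print(percent_identity("AC-E", "ACDE"))
```

A low identity to every human protein suggests, but does not prove, that an inhibitor is less likely to hit a human protein as well.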

31 Bioinformatics of Viral Genomes SARS Drug design: The target: 3C-like protease

32 Bioinformatics of Viral Genomes SARS Drug design: Inhibitor design: Finding inhibitors of similar proteins, such as those of the same name (3C-like proteases or 3C proteases of other species), may offer clues to inhibitor design. Search NCBI.

33 Bioinformatics of Viral Genomes The NCBI search finds 19 references.
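The same kind of literature search can be scripted through the NCBI E-utilities ESearch service instead of the web page. The sketch below only builds the request URL and does not contact NCBI. The query text is an assumption, since the exact keywords used on the slide are not given, so the number of hits will not necessarily be 19.

```python
from urllib.parse import urlencode

# Base address of the ESearch service in the NCBI E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that searches PubMed for the given term."""
    query = urlencode({"db": "pubmed", "term": term, "retmax": retmax})
    return f"{EUTILS_ESEARCH}?{query}"


# Hypothetical query; adjust the keywords to the protein of interest.
print(pubmed_search_url('"3C-like protease" AND inhibitor'))
```

Opening the printed URL returns an XML list of PubMed IDs, whose abstracts can then be checked one by one.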

34 Bioinformatics of Viral Genomes Check each abstract to find the name of one or more inhibitors. Be prepared to read the full paper to find inhibitors

35 Bioinformatics of Viral Genomes Make sure the paper discusses inhibitors of the right protein. This one actually discusses inhibitors of the protease family in general, so they may not be suitable for the SARS 3C-like protease.

36 Bioinformatics of Viral Genomes SARS Drug design: Inhibitor design: Finding inhibitors of similar proteins, such as those of the same name (3C-like proteases or 3C proteases of other species), may offer clues to inhibitor design. Search Google.

37 Bioinformatics of Viral Genomes The Google search finds numerous entries.

38 Bioinformatics of Viral Genomes Check each entry to find the name of one or more inhibitors. Be prepared to read the full paper to find inhibitors

39 Bioinformatics of Viral Genomes Design of SARS 3C-like protease inhibitors using rhinovirus 3C-like protease inhibitors as templates

40 Summary of Today’s Lecture: Genome database at NCBI. Viral genomes (SARS coronavirus genome as an example). Finding proteins from a genome. Therapeutic target identification from a genome and inhibitor design.