Protein Homology Discovery Mixed bag of proteins Protein Homologies PHD Genes Database Open reading frame finder Proteins Database BLAST Clustering Protein.

Slides:



Advertisements
Similar presentations
Question: Find the equation of a line that is parallel to the equation: 3x + 2y = 18.
Advertisements

The design, construction and use of software tools to generate, store, annotate, access and analyse data and information relating to Molecular Biology.
Genome analysis and annotation. Genome Annotation Which sequences code for proteins and structural RNAs ? What is the function of the predicted gene products.
Systems Biology Existing and future genome sequencing projects and the follow-on structural and functional analysis of complete genomes will produce an.
Research Topics Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009.
Sequence Analysis MUPGRET June workshops. Today What can you do with the sequence? What can you do with the ESTs? The case of SNP and Indel.
Protein RNA DNA Predicting Protein Function. Biochemical function (molecular function) What does it do? Kinase??? Ligase??? Page 245.
Kate Milova MolGen retreat March 24, Microarray experiments: Database and Analysis Tools. Kate Milova cDNA Microarray Facility March 24, 2005.
Kate Milova MolGen retreat March 24, Microarray experiments. Database and Analysis Tools. Kate Milova cDNA Microarray Facility March 24, 2005.
Summary Protein design seeks to find amino acid sequences which stably fold into specific 3-D structures. Modeling the inherent flexibility of the protein.
Kate Milova MolGen retreat March 24, Microarray experiments. Database and Analysis Tools. Kate Milova cDNA Microarray Facility March 24, 2005.
Displaying associations, improving alignments and gene sets at UCSC Jim Kent and the UCSC Genome Bioinformatics Group.
Sequence-Structure-Function Sequence Structure Function Threading Ab initio BLAST Folding: impossible but for the smallest structures Function prediction.
Similar Sequence Similar Function Charles Yan Spring 2006.
We are developing a web database for plant comparative genomics, named Phytome, that, when complete, will integrate organismal phylogenies, genetic maps.
Modeling Functional Genomics Datasets CVM Lesson 1 13 June 2007Bindu Nanduri.
BLOSUM Information Resources Algorithms in Computational Biology Spring 2006 Created by Itai Sharon.
Kate Milova MolGen retreat March 24, Microarray experiments. Database and Analysis Tools. Kate Milova cDNA Microarray Facility March 24, 2005.
A Study of GeneWise with the Drosophila Adh Region Asta Gindulyte CMSC 838 Presentation Authors: Yi Mo, Moira Regelson, and Mike Sievers Paracel Inc.,
Comparative Genomics of the Eukaryotes
Basic Introduction of BLAST Jundi Wang School of Computing CSC691 09/08/2013.
Chapter 14 Genomes and Genomics. Sequencing DNA dideoxy (Sanger) method ddGTP ddATP ddTTP ddCTP 5’TAATGTACG TAATGTAC TAATGTA TAATGT TAATG TAAT TAA TA.
Gene Expression Data Qifang Xu. Outline cDNA Microarray Technology cDNA Microarray Technology Data Representation Data Representation Statistical Analysis.
NCBI Review Concepts Chuong Huynh. NCBI Pairwise Sequence Alignments Purpose: identification of sequences with significant similarity to (a)
Discover the UniProt Blast tool. Murcia, February, 2011Protein Sequence Databases Customize the BLAST results.
CISC667, F05, Lec9, Liao CISC 667 Intro to Bioinformatics (Fall 2005) Sequence Database search Heuristic algorithms –FASTA –BLAST –PSI-BLAST.
Discovering the Correlation Between Evolutionary Genomics and Protein-Protein Interaction Rezaul Kabir and Brett Thompson
Biological Databases Biology outside the lab. Why do we need Bioinfomatics? Over the past few decades, major advances in the field of molecular biology,
Construction of Substitution Matrices
Toward a Unified Gene Page GMOD Meeting, April 2004 Don Gilbert,
Web Databases for Drosophila Introduction to FlyBase and Ensembl Database Wilson Leung6/06.
Bulk data files // TeraGrid uses for Genome Databases GMOD meet, June 2006 Don Gilbert,
Bioinformatic Tools for Comparative Genomics of Vectors Comparative Genomics.
Building WormBase database(s). SAB 2008 Wellcome Trust Sanger Insitute Cold Spring Harbor Laboratory California Institute of Technology ● RNAi ● Microarray.
Curation Tools Gary Williams Sanger Institute. SAB 2008 Gene curation – prediction software Gene prediction software is good, but not perfect. Out of.
Lettuce/Sunflower EST CGPDB project. Data analysis, assembly visualization and validation. Alexander Kozik, Brian Chan, Richard Michelmore. Department.
1 Improve Protein Disorder Prediction Using Homology Instructor: Dr. Slobodan Vucetic Student: Kang Peng.
Analysis and comparison of very large metagenomes with fast clustering and functional annotation Weizhong Li, BMC Bioinformatics 2009 Present by Chuan-Yih.
Genome annotation and search for homologs. Genome of the week Discuss the diversity and features of selected microbial genomes. Link to the paper describing.
Functional prediction methods. The usual troubles of the molecular and cellular biology labs What are the functions of a previously non characterized.
How can we find genes? Search for them Look them up.
Large-scale Prediction of Yeast Gene Function Introduction to Bio-Informatics Winter Roi Adadi Naama Kraus
Construction of Substitution matrices
David Wishart February 18th, 2004 Lecture 3 BLAST (c) 2004 CGDN.
Gene Family Size Distributions Brought to You By Your Neighorhood Durand Lab Narayanan Raghupathy Nan Song Rose Hoberman.
Gene models and proteomes for Saccharomyces cerevisiae (Sc), Schizosaccharomyces pombe (Sp), Arabidopsis thaliana (At), Oryza sativa (Os), Drosophila melanogaster.
Pairwise Sequence Alignment Exercise 2. || || ||||| ||| || || ||||||||||||||||||| MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFE… ATGGTGAACCTGACCTCTGACGAGAAGACTGCCGTCCTTGCCCTGTGGAACAAGGTGGACG.
What is BLAST? Basic BLAST search What is BLAST?
Friday July 18 th Update “WesList” proteins Wes and Mark? identified a number of potential targets based upon the presence of patents – the enzyme name,
Gene Ontology TM (GO) Consortium
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Decision trees for hierarchical multi-label classification.
Bioinformatics Shared Resource Bioinformatics : How to… Bioinformatics Shared Resource Kutbuddin Doctor, PhD.
Bos taurus Olfactory Receptor Katie Davis 1,2 and Sandra Rodriguez-Zas 1 1 Department of Animal Sciences, University of Illinois Urbana-Champaign, 2 ACES.
Sequence-Structure-Function Sequence Structure Function Threading Ab initio BLAST Folding: impossible but for the smallest structures Function prediction.
Bioinformatics What is a genome? How are databases used? What is a phylogentic tree?
BLAST: Basic Local Alignment Search Tool Robert (R.J.) Sperazza BLAST is a software used to analyze genetic information It can identify existing genes.
What is BLAST? Basic BLAST search What is BLAST?
Sequencing, de novo assembling, and annotating the genome of the endangered Chinese crocodile lizard, shinisaurus crocodilurus Jian gao, qiye li, zongji.
Basics of BLAST Basic BLAST Search - What is BLAST?
Lettuce/Sunflower EST CGPDB project.
GEP Annotation Workflow
Bioinformatics and BLAST
Geometrical Data Analysis
Identify D. melanogaster ortholog
Integrating Genomic Databases
Multiple sequence alignment & Phylogenetics Analysis
Shared components define the ‘ancient’ phagosome.
Computational genomics
Basic Local Alignment Search Tool
Sequence Analysis Alan Christoffels
Presentation transcript:

Protein Homology Discovery Mixed bag of proteins Protein Homologies PHD Genes Database Open reading frame finder Proteins Database BLAST Clustering Protein Homology Database Inner components of PHD - Cylinders are used for databases (Raw information) - Blocks are used for operations performed on this information Goals and motivation for the PHD -Building a library of protein families -Useful for functional and structural prediction -Methodology applied to UniGene organisms

Inner workings of the PHD P HD BLAST ORF Genes Proteins Genes downloaded from public Unigene Db Apply an open- reading frame finder algorithm to extract the proteins Clustering Proteins Db constructed Do an all against all alignment using BLAST Clustering algorithms applied to BLAST results Final Db contains protein homologies

Who goes into the funnel?

species # of genes/proteins bovine taurus 6871 zebrafish 10,000 homo sapien 32,000 mouse From UniGene: From Stanford: Yeast 15,000

Who goes into the funnel? Drosophila 15,000 From Flybase: From Sanger institute: Nematoad 19,000

Clustering (what we have) - Graphs were constructed from binary matrices using a threshold e-value - Clustering algorithms operated on the resultant graphs to give rise to the several clusters Example on clustering

Clustering (what we will have) Hierarchical clustering

…..

Current work Parallelizing BLAST Clustering / hierarchical clustering Hierarchical Clustering Parallel Clustering All against All Future work