The Wold Lab BioHub Cory Tobin. Collaborators Brandon King Joe Roden Diane Trout Dr. Barbara.


Goal
Standardize the relationships among biological data
Integrate all of the data seamlessly
Provide novel methods to search for and analyze data

Adapted from

My Contribution
Implement a database for homology data

Background
[Figure: genes in Species A and Species B, with relationships labeled Paralogs and Orthologs]
The more general term is “homology”
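As a rough illustration of the terms on this slide, the sketch below labels a pair of genes that are already known to be homologous. It uses the simplified rule that homologs within one species are paralogs and homologs in different species are orthologs; real classification depends on the gene tree. The `Gene` type, the `relationship` function, and the gene names are hypothetical and do not come from BioHub.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """A gene identified by name and the species it belongs to (hypothetical type)."""
    name: str
    species: str


def relationship(a: Gene, b: Gene) -> str:
    """Label two genes that are ASSUMED to be homologous.

    Simplification: same species -> paralogs, different species -> orthologs.
    """
    if a == b:
        raise ValueError("a gene cannot be compared with itself")
    return "paralogs" if a.species == b.species else "orthologs"


# Made-up genes for illustration only.
gene_a1 = Gene("gene1", "Species A")
gene_a2 = Gene("gene2", "Species A")
gene_b1 = Gene("gene1", "Species B")
```

Under this rule, `relationship(gene_a1, gene_a2)` gives "paralogs" and `relationship(gene_a1, gene_b1)` gives "orthologs".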

Requirements
Be more accurate and flexible than HomoloGene
Work in real time
Make sense of HomoloGene’s misleading data

Rationale
[Figure: the same pair of genes as described by HomoloGene and by BioHub]
HomoloGene: They are similar
BioHub: They are related like this

Rationale Continued
[Figure: Human Genome containing Seq A and Seq B; Mouse Genome containing Seq C]
HomoloGene would BLAST seq A against mouse and determine that seq C is an ortholog of seq A.
HomoloGene would also BLAST seq B against mouse and determine that seq C is an ortholog of seq B.
BioHub will BLAST seq A against mouse, find seq C, then BLAST seq C back against human to see if there are any better matches. It will find seq B to be a better match.
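The back-and-forth search described on this slide is commonly called a reciprocal best hit check. The following is a minimal sketch of that idea, not BioHub's actual code: the `SCORES` table holds made-up similarity values standing in for BLAST results, and the names `GENOMES`, `best_hit`, and `reciprocal_best_hit` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Made-up similarity scores standing in for BLAST results (higher is better).
# Keys are (query, subject) pairs.
SCORES: Dict[Tuple[str, str], float] = {
    ("A", "C"): 180.0,
    ("B", "C"): 240.0,
    ("C", "A"): 180.0,
    ("C", "B"): 240.0,
}

# Which sequences live in which genome, as in the slide's figure.
GENOMES: Dict[str, List[str]] = {"human": ["A", "B"], "mouse": ["C"]}


def best_hit(query: str, target_genome: str) -> Optional[str]:
    """Return the highest-scoring sequence in target_genome, or None if no hit."""
    hits = [
        (SCORES[(query, subject)], subject)
        for subject in GENOMES[target_genome]
        if (query, subject) in SCORES
    ]
    return max(hits)[1] if hits else None


def reciprocal_best_hit(query: str, query_genome: str, target_genome: str) -> Optional[str]:
    """Return the target sequence only if the search back lands on the query."""
    forward = best_hit(query, target_genome)
    if forward is None:
        return None
    backward = best_hit(forward, query_genome)
    return forward if backward == query else None
```

With these scores, `reciprocal_best_hit("A", "human", "mouse")` returns `None`, because searching seq C back against human finds seq B to be the better match, while `reciprocal_best_hit("B", "human", "mouse")` returns `"C"`.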

Methods
Design data relationships that make sense biologically
Generate the low-level database interaction code
Parse and load HomoloGene’s data into our database
Write biologically useful functions
Create a web-based interface for easy use
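The parse, load, and query steps listed above might look roughly like the sketch below. It makes several assumptions: Python's built-in `sqlite3` stands in for PostgreSQL so the example runs anywhere, the SQL is written by hand rather than generated by Pymerase, the four-column tab-delimited layout (group id, taxonomy id, gene id, symbol) is a guess at a simplified input format, and the sample records are made up. The function names are hypothetical.

```python
import sqlite3
from typing import Iterator, List, Tuple

# Made-up records in an ASSUMED tab-delimited layout:
# group id, NCBI taxonomy id, gene id, gene symbol.
# 9606 is the taxonomy id for human, 10090 for mouse; the gene ids are invented.
SAMPLE = "1\t9606\t101\tGENE1\n1\t10090\t202\tGene1\n2\t9606\t303\tGENE2\n"

Record = Tuple[int, int, int, str]


def parse_records(text: str) -> Iterator[Record]:
    """Yield one (group_id, tax_id, gene_id, symbol) tuple per non-empty line."""
    for line in text.splitlines():
        if not line.strip():
            continue
        group_id, tax_id, gene_id, symbol = line.split("\t")
        yield int(group_id), int(tax_id), int(gene_id), symbol


def load(conn: sqlite3.Connection, records: Iterator[Record]) -> None:
    """Create the table if needed and insert all parsed records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS homolog ("
        "group_id INTEGER, tax_id INTEGER, gene_id INTEGER PRIMARY KEY, symbol TEXT)"
    )
    conn.executemany("INSERT INTO homolog VALUES (?, ?, ?, ?)", records)
    conn.commit()


def homologs_of(conn: sqlite3.Connection, gene_id: int) -> List[Tuple[int, int, str]]:
    """Return (gene_id, tax_id, symbol) for every other gene in the same group."""
    rows = conn.execute(
        "SELECT h2.gene_id, h2.tax_id, h2.symbol "
        "FROM homolog h1 JOIN homolog h2 ON h1.group_id = h2.group_id "
        "WHERE h1.gene_id = ? AND h2.gene_id != ? "
        "ORDER BY h2.gene_id",
        (gene_id, gene_id),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
load(conn, parse_records(SAMPLE))
```

Here `homologs_of(conn, 101)` returns the single mouse record that shares group 1 with gene 101, and `homologs_of(conn, 303)` returns an empty list because gene 303 is alone in its group.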

Materials
ArgoUML – Design Aid
Pymerase – Design Implementation
PostgreSQL – Database
HomoloGene – Data Source
Python – Programming Language

Current State
Design data relationships that make sense biologically
Generate the low-level database interaction code
Parse and load HomoloGene’s data into our database
Write biologically useful functions
Create a web-based interface for easy use

Example Usage
Sequence of Interest: …GGATACAAAATTCCTC…
Are there any known genes in this sequence?
acetyl-coenzyme A dehydrogenase (Human)
(cont.)

acetyl-coenzyme A dehydrogenase (Human)
Are there any homologs?
Mouse, Rat, Mosquito, Fruit fly, Nematode
(cont.)

How are those genes related?

Where do you want to go?

More Info
BioHub: woldlab.caltech.edu/biohub
HomoloGene: www.ncbi.nlm.nih.gov
Python: python.org
Pymerase: pymerase.sf.net
PostgreSQL: postgresql.org