Phytome: A Plant Comparative Genomics Database

Dihui Lu 1,2, Jason Phillips 3, Todd Vision 2,3
1 School of Information and Library Science, 2 Program in Bioinformatics and Computational Biology, 3 Dept. of Biology
University of North Carolina at Chapel Hill

Abstract

We are developing a web database for plant comparative genomics, named Phytome, that, when complete, will integrate organismal phylogenies, genetic maps and gene phylogenies for over a dozen model plant species. Phytome’s aim is to help users (i) explore relationships among genes/proteins and chromosome segments within and between species and (ii) predict gene content in uncharacterized chromosomal regions. Phytome has been implemented as a relational database that currently allows users to search and retrieve protein sequences, gene families, multiple alignments and phylogenetic trees for nine species. The interface enables the user to obtain customized displays of multiple alignments and phylogenetic trees.

Overview

Why a plant comparative genomics database?
• Comparisons of the composition, organization, and functional components of genomes are needed to answer many different basic and applied research questions (in areas such as gene prediction, functional sequence annotation and candidate gene identification) [3].
• Genomic maps and large sequence datasets exist for a wide variety of plant taxa, but these data are not currently centralized.
• Many comparative genomic analyses require intensive computation and use relatively arcane computational methods.

Why include phylogenetic information?
• Phylogenetics provides a framework to make predictions about poorly known species and genes by virtue of their relationship to better known species and genes [3].
• Phylogenetic information has not yet been incorporated into major genomic database resources despite its acknowledged utility to the user community.
• Few phylogenetics database resources exist at all (major exceptions being TreeBase [4] and the Tree of Life Project [1]).

Target Users

We are designing Phytome for the following classes of users:
• Plant breeders who wish to predict the possible location and function of an unknown marker or DNA/protein sequence.
• Molecular biologists who are interested in knowing the relationships among members of a particular gene/protein family.
• Molecular evolutionists who are interested in genome and chromosomal evolution.

Future work

• Further develop tools for searching, browsing, data retrieval and data analysis/mining, and enable users to contribute content.
• Refine the analysis pipeline for phylogenetics and comparative mapping.
• Increase interconnectivity with related databases.
• Incorporate genomic maps together with analysis and visualization tools for comparative mapping [e.g. 2].

References

1) Anonymous (2000) Assembling the Tree of Life II: Research Needs in Phyloinformatics.
2) Calabrese PP, Chakravarty S, Vision TJ. Fast identification and statistical evaluation of segmental homologies in comparative maps. Bioinformatics 19, i74-i80.
3) Eisen JA, Wu M (2002) Phylogenetic analysis and gene functional predictions: phylogenomics in action. Theoretical Population Biology 61.
4) Piel WH, Donoghue M, Sanderson M (2000) TreeBase: A database of Phylogenetic Information. Proceedings of the 2nd International Workshop of Species 2000, Tsukuba, Japan.

Acknowledgments

We thank Dr. Brad Hemminger for useful guidance. This work is supported by NSF grant DBI to TJV.

Current functionality of Phytome

• Search proteins and families by sequence similarity (BLAST), keywords, or database IDs.
• Retrieve raw data (proteins, protein families, multiple alignments and phylogenetic trees) from Phytome for local analysis.
• Dynamically display multiple alignments and phylogenetic trees for selected proteins within a family.
• Cross-reference Phytome proteins with other genomic DNA and EST data sources such as GenBank, TIGR, and TAIR.

Data stored in Phytome

• An authoritative plant organismal phylogeny
• Gene family information
  - Protein-coding DNA sequences from a variety of species
  - Pre-computed multiple sequence alignments
  - Pre-computed gene family phylogenies
• Genetic and physical maps for diverse species (giving the locations of genes and other markers along the chromosomes), to be added in the future
• Gene Ontology terms, other protein functional annotations, and database cross-references

Web interface

We have developed a user-friendly interface for Phytome. Text search, sequence similarity search and other web applications are available for users to search individual proteins and families and to download data from Phytome.

[Figure: Dynamically generated phylogenetic tree. Phylogenetic tree generated using Drawtree (from the PHYLIP package) for selected proteins from family number 300 (putative nucleotide sugar epimerases).]

[Figure: Dynamically generated multiple alignments. Multiple alignments generated using Jalview for selected proteins from family number 234 (putative zinc finger proteins).]

Phytome data pipeline

[Flowchart: Phytome data pipeline]

Processing steps labeled in the flowchart (tool in parentheses):
• Identify protein sequence matches (BLAST)
• Align proteins within families (CLUSTALW)
• Protein sequence prediction (ESTWise)
• Cluster proteins into families (TRIBE-MCL)
• Estimate phylogenies (PHYLIP)

Data sources and products labeled in the flowchart:
• TIGR Gene Index
• GenBank IDs
• Component ESTs
• Gene Ontology
• Protein sequences
• Family clusters
• Multiple alignments
• Phylogenetic trees
• Phytome
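
The poster describes Phytome as a relational database that holds protein sequences, families and phylogenetic trees and supports keyword search and retrieval by family. The following is a minimal sketch of how such records might be related, using Python's built-in sqlite3 module. It is an illustration under stated assumptions, not Phytome's actual schema: every table name, column name, helper function, protein ID, sequence and tree string below is hypothetical. Only the family number 300 and its description are taken from the figure caption above.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only
# and are not taken from Phytome itself.
SCHEMA = """
CREATE TABLE family (
    family_id   INTEGER PRIMARY KEY,
    description TEXT
);
CREATE TABLE protein (
    protein_id  TEXT PRIMARY KEY,
    species     TEXT NOT NULL,
    sequence    TEXT NOT NULL,
    family_id   INTEGER REFERENCES family(family_id)
);
CREATE TABLE family_tree (
    family_id   INTEGER PRIMARY KEY REFERENCES family(family_id),
    newick      TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database populated with toy placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO family VALUES (?, ?)",
        (300, "putative nucleotide sugar epimerases"),
    )
    # Placeholder proteins; IDs, species labels and sequences are made up.
    conn.executemany(
        "INSERT INTO protein VALUES (?, ?, ?, ?)",
        [
            ("P1", "species A", "MKTAYIAK", 300),
            ("P2", "species B", "MKTAYLAK", 300),
            ("P3", "species A", "MSLLNQRT", None),  # not assigned to a family
        ],
    )
    # Placeholder tree in Newick format for the two family members.
    conn.execute(
        "INSERT INTO family_tree VALUES (?, ?)",
        (300, "(P1:0.1,P2:0.1);"),
    )
    conn.commit()
    return conn


def proteins_in_family(conn, family_id):
    """Return (protein_id, species) pairs for one family, sorted by ID."""
    cur = conn.execute(
        "SELECT protein_id, species FROM protein "
        "WHERE family_id = ? ORDER BY protein_id",
        (family_id,),
    )
    return cur.fetchall()


def search_families(conn, keyword):
    """Keyword search over family descriptions (substring match)."""
    cur = conn.execute(
        "SELECT family_id, description FROM family WHERE description LIKE ?",
        (f"%{keyword}%",),
    )
    return cur.fetchall()


def tree_for_family(conn, family_id):
    """Return the stored Newick string for a family, or None if absent."""
    row = conn.execute(
        "SELECT newick FROM family_tree WHERE family_id = ?",
        (family_id,),
    ).fetchone()
    return row[0] if row else None
```

With this layout, calling `proteins_in_family(build_demo_db(), 300)` lists the members of one family, and `tree_for_family` returns the stored tree as text that a drawing tool could render. Keeping trees and alignments keyed by family ID is one plausible way to support the per-family displays described above; the real system may organize them differently.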