Functional Genomics Consortium: NIDDK 56947 (Kaestner) and 56954 (Permutt)


Functional Genomics Consortium: NIDDK 56947 (Kaestner) and 56954 (Permutt)
Lab Members: Marie Scearce, John Brestelli, Catherine Lee, Athanasios Arsenlis, Phuc Le, Marko Vatamaniuk
Collaborators: Gerard Gradwohl (Strasbourg), Ihor Lemischka (Princeton)
CBIL: Chris Stoeckert, Elisabetta Manduchi, Greg Grant, Joan Mazzarelli, Phuc Le, Angel Pizzaro, Deborah F. Pinney, Jonathan Crabtree, Shannon McWeeney, Brian Brunk
Consortium: Alan Permutt, Hiroshi Inoue, Doug Melton, Sandra Clifton, Deana Pape, Buddy Brownstein

Functional Genomics of the Developing Endocrine Pancreas
cDNA libraries from pancreatic tissue
  Consortium libraries (currently > 34,000 ESTs)
    Pancreatic transcripts: 10,364 EST assemblies (from 25,866 mouse ESTs)
    Novel transcripts: 3,190 assemblies with only consortium ESTs
  Relevant dbEST libraries
Microarray studies on pancreatic tissue
  Genome-wide survey for genes expressed
  Pancreas chip
    Validated sequences of interest
    Novel sequences from libraries
Goal: Identify genes expressed in the developing endocrine pancreas

www.cbil.upenn.edu/EPConDB

CBIL Project Architecture
Projects: PlasmoDB, AllGenes, EPConDB
GUS: sequence & annotation; gene index (ESTs and mRNAs)
RAD: microarray expression data; experimental annotation
Relational DB (Oracle) with Perl object layer
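The layering named on this slide, a relational database with an object layer on top, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: sqlite3 stands in for Oracle, Python stands in for Perl, and the table names `gus_assembly` and `rad_expression`, the `Assembly` class, and the `load_assembly` helper are hypothetical and are not the actual GUS or RAD schema.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical, heavily simplified stand-ins for the two schemas on the slide.
# sqlite3 replaces Oracle and Python replaces Perl purely for illustration.
SCHEMA = """
CREATE TABLE gus_assembly (
    assembly_id INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE rad_expression (
    assembly_id INTEGER NOT NULL REFERENCES gus_assembly(assembly_id),
    experiment  TEXT NOT NULL,
    log_ratio   REAL NOT NULL
);
"""


@dataclass
class Assembly:
    """Object-layer view of one gene-index row plus its expression values."""
    assembly_id: int
    description: str
    expression: dict  # experiment name -> log ratio


def load_assembly(conn, assembly_id):
    """Build an Assembly by combining sequence-side and expression-side rows."""
    row = conn.execute(
        "SELECT assembly_id, description FROM gus_assembly WHERE assembly_id = ?",
        (assembly_id,),
    ).fetchone()
    if row is None:
        return None
    values = conn.execute(
        "SELECT experiment, log_ratio FROM rad_expression WHERE assembly_id = ?",
        (assembly_id,),
    ).fetchall()
    return Assembly(row[0], row[1], dict(values))


def demo():
    """Create an in-memory database with made-up rows and load one object."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO gus_assembly VALUES (1, 'example transcript')")
    conn.executemany(
        "INSERT INTO rad_expression VALUES (?, ?, ?)",
        [(1, "exp_a", 1.5), (1, "exp_b", -0.25)],
    )
    return load_assembly(conn, 1)
```

Calling `demo()` returns a single object that carries both the annotation and the expression values, which is the kind of convenience an object layer over separate sequence and expression tables would provide.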

RAD, GUS
EST clustering and assembly
TESS (Transcription Element Search Software): identify shared TF binding sites
Genomic alignment and comparative sequence analysis: identify shared TF binding sites

EPConDB: Content and Features
Pancreas clone sets
  Panc Chip clone sets 1.0, 1.5, 2.0
  Transcripts found in consortium libraries
  Novel transcripts discovered from consortium libraries
Microarray results
  Using Incyte’s GEM (genome-wide survey)
  Using Panc Chip
Genes expressed in pancreas
  AllGenes queries: function, chromosomal location, keyword, accession, libraries
  Pathways

EPConDB Pathway query

EPConDB Boolean Query

EPConDB History Query

The Future of EPConDB
Display new microarray results
  Kaestner: labeling study, developmental series, mutant mice
Central repository for pancreatic experiments
  2-color cDNA, long oligos, Affymetrix, SAGE, etc.
  Provide tools for microarray import/export
  MGED: MIAME-compliance, MAGE, Ontology [Standard Annotation]
  John Hutton / Ron Taylor (UCHSC)
  Beta Cell Biology Consortium?
Starting point for analysis
  Provide tools: Xcluster, PaGE, Speed Normalization Package
  Download datasets
  Coordinate experiments to facilitate comparison
Integrate experiments
  With genomic data (AllGenes), proteomics? Atlases?
  Build networks
Goal: Comprehensive understanding of pancreatic gene expression

Microarray Analysis: Xcluster
Xcluster provided by Gavin Sherlock, 2001

Microarray Analysis: Data download

Microarray Gene Expression Database group (MGED)
International effort on microarray data standards:
  Develop standards for storing and communicating microarray-based gene expression data
    defining the minimal information required to ensure reproducibility and verifiability of results and to facilitate data exchange (MIAME, MAGEML-MAGEDOM)
    collecting (and where needed creating) controlled vocabularies/ontologies
    developing standards for data comparison and normalization
The schema is compliant with the minimum annotations recommended by MGED.
MIAME: Minimum Information About a Microarray Experiment (common set of concepts that need to be captured in a database to describe gene expression experiments adequately for interpretation, reproduction or critical assessment).
MAML: MicroArray Mark-up Language (XML Document Type Definitions of the concepts).
http://www.mged.org
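The idea of a minimum-information checklist can be sketched as a simple completeness check. The six section names follow the MIAME 1.0 checklist; the dictionary layout, the key spellings, and the `missing_miame_sections` helper are assumptions made for illustration and are not part of any MGED tool or of the schema mentioned above.

```python
# The six sections of the MIAME 1.0 checklist, written as hypothetical dict keys.
MIAME_SECTIONS = (
    "experimental_design",
    "array_design",
    "samples",
    "hybridizations",
    "measurements",
    "normalization_controls",
)


def missing_miame_sections(record):
    """Return the checklist sections that are absent or empty in `record`."""
    return [section for section in MIAME_SECTIONS if not record.get(section)]


# Made-up example record: two sections are missing and one is empty.
example_record = {
    "experimental_design": "developmental time series",
    "array_design": "pancreas cDNA array",
    "samples": ["sample 1", "sample 2"],
    "hybridizations": [],
}

if __name__ == "__main__":
    print(missing_miame_sections(example_record))
```

A record would count as complete in this sketch only when the returned list is empty; a real compliance check would also validate the content of each section, which this example does not attempt.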

Future EPConDB Query Result

Microarray Analysis: R statistics

Microarray Analysis: PaGE

Assembled Transcripts
Human
  About 3 million human EST and mRNA sequences used
  Combined into 797,028 assemblies
  Cluster into 150,006 “genes”
  Can identify a protein for 76,771 genes
  And predict a function for 24,127 genes
Mouse
  About 2 million mouse EST and mRNA sequences used
  Combined into 355,770 assemblies
  Cluster into 74,024 “genes”
  Can identify a protein for 34,008 genes
  And predict a function for 15,403 genes
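The counts on this slide are easier to compare as ratios. The short sketch below only does arithmetic on the numbers given above; the dictionary layout and the `summarize` helper are illustrative names, and the approximate sequence totals ("about 3 million", "about 2 million") are left out because they are not exact.

```python
# Counts copied from the slide; nothing here is new data.
COUNTS = {
    "human": {
        "assemblies": 797_028,
        "genes": 150_006,
        "with_protein": 76_771,
        "with_function": 24_127,
    },
    "mouse": {
        "assemblies": 355_770,
        "genes": 74_024,
        "with_protein": 34_008,
        "with_function": 15_403,
    },
}


def summarize(counts):
    """Derive per-gene ratios from one species' raw counts."""
    genes = counts["genes"]
    return {
        "assemblies_per_gene": counts["assemblies"] / genes,
        "protein_fraction": counts["with_protein"] / genes,
        "function_fraction": counts["with_function"] / genes,
    }


SUMMARY = {species: summarize(counts) for species, counts in COUNTS.items()}

if __name__ == "__main__":
    for species, s in SUMMARY.items():
        print(
            f"{species}: {s['assemblies_per_gene']:.2f} assemblies per gene, "
            f"{s['protein_fraction']:.1%} with a protein, "
            f"{s['function_fraction']:.1%} with a predicted function"
        )
```

On these numbers, roughly half of the human gene clusters (about 51%) and a little under half of the mouse clusters (about 46%) have an identified protein, while a predicted function is available for about 16% of human and about 21% of mouse clusters.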

AllGenes Enhancements: Annotated Entries

AllGenes Enhancements: Genomic Data

Update: Consortium Libraries so far
Total Sequences: > 29,000
Mouse in DOTS: 21,196
Pancreas Assemblies: 7,294
NOVEL: 3,122!