BioRDF Breakout: Introduction – Kei Cheung; MAGE-TAB – Michael Miller

Presentation transcript:

BioRDF Breakout
- Introduction – Kei Cheung
- MAGE-TAB – Michael Miller
- vOID – Jun Zhao (remote)
- aTag – Matthias Samwald (remote)
- Discussion – All

BioRDF Breakout: Microarray Use Case
Kei Cheung, Ph.D., Associate Professor, Yale Center for Medical Informatics
HCLS IG Face-to-Face Meeting, Santa Clara, California, November 2-3, 2009

Introduction
- Whole-genome expression profiling has created a revolution in the way we study disease and basic biology.
- DNA microarrays allow scientists to quantify thousands of genomic features in a single experiment.
- Since 1997, the number of published results based on an analysis of gene expression microarray data has grown from 30 to over 5,000 publications per year.
- Major public microarray data repositories have been created in different countries (e.g., NCBI GEO, EBI ArrayExpress, and CIBEX).

Microarray Workflow

An Example of differentially expressed genes

Importance of Integrating Microarray Data
- Due to the high cost and low reproducibility of many microarray experiments, it is not surprising to find a limited number of patient samples in each study, and very few marker genes identified in common among different studies involving patients with the same disease.
- It is of great interest, and a challenge, to merge data sets from multiple studies to increase the sample size, which may in turn increase the power of statistical inferences.
- The integration of external information resources is essential in interpreting intrinsic patterns and relationships in large-scale gene expression data.
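As a rough illustration of merging data sets to increase sample size, the sketch below combines two studies on the genes they share. The function name, the nested-dictionary layout, and the toy values are all assumptions made for this example, not part of the original slides; real integration would also need cross-platform probe mapping and normalization, which this sketch omits.

```python
def merge_studies(study_a, study_b):
    """Combine two {gene: {sample: value}} tables on the genes they share."""
    shared_genes = sorted(set(study_a) & set(study_b))
    merged = {}
    for gene in shared_genes:
        row = {}
        # Prefix sample names so each value stays traceable to its study.
        row.update({f"A:{s}": v for s, v in study_a[gene].items()})
        row.update({f"B:{s}": v for s, v in study_b[gene].items()})
        merged[gene] = row
    return merged


# Toy, made-up expression values for hypothetical genes and samples.
study_a = {"GeneX": {"s1": 1.2, "s2": 0.8}, "GeneY": {"s1": 2.0, "s2": 2.1}}
study_b = {"GeneX": {"s1": 1.1}, "GeneZ": {"s1": 0.3}}

merged = merge_studies(study_a, study_b)
print(merged)
```

Only genes measured in both studies survive the merge, which is one reason the choice of a common gene identifier matters before any pooled statistics are computed.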

Microarray Data Standards
- MGED
- MIAME
- MAGE-ML
- MAGE-TAB

Some Examples
- Joint analysis of two microarray gene-expression data sets to select lung adenocarcinoma marker genes (Jiang et al. 2004, BMC Bioinformatics)
- Large-scale integration of cancer microarray data identifies a robust common cancer signature (Xu et al. 2007, BMC Bioinformatics)
- What about the neurosciences?

Access to and Use of Microarray Data in Neuroscience
- NIH Neuroscience Microarray Consortium
- Public repositories such as GEO and ArrayExpress (including data generated from neuroscience microarray experiments)
- Brain atlases (e.g., Allen Brain Atlas and GenSAT)

Ontology-Based Integration
- Microarray experiment 1, Microarray experiment 2
- Brain region (e.g., entorhinal cortex, hippocampus, primary visual cortex)
- Layer (e.g., Layer 2 of the entorhinal cortex)
- Neuron (e.g., stellate island neuron, pyramidal neuron)
- Diagram labels: Part-of, Neuron ontology, Input to
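One way to picture how part-of relations let annotations at different granularities be integrated is sketched below. The specific part-of links, the sample identifiers, and the function names are assumptions for illustration only; the slide names the terms but does not spell out which term is part of which.

```python
# Assumed part-of links (child -> parent); not taken from a real ontology file.
PART_OF = {
    "stellate island neuron": "layer 2 of entorhinal cortex",
    "layer 2 of entorhinal cortex": "entorhinal cortex",
}


def ancestors(term, part_of=PART_OF):
    """Return every structure that `term` is (transitively) part of."""
    chain = []
    while term in part_of:
        term = part_of[term]
        chain.append(term)
    return chain


def samples_within(region, sample_annotations, part_of=PART_OF):
    """Find samples annotated with `region` or with any part of it."""
    return sorted(
        sample
        for sample, term in sample_annotations.items()
        if term == region or region in ancestors(term, part_of)
    )


# Hypothetical samples from two experiments, annotated at different levels.
sample_annotations = {
    "experiment1/sampleA": "entorhinal cortex",
    "experiment2/sampleB": "stellate island neuron",
    "experiment2/sampleC": "hippocampus",
}

hits = samples_within("entorhinal cortex", sample_annotations)
print(hits)
```

Here a sample annotated at the neuron level is still found by a query at the brain-region level, because the lookup walks the part-of chain upward.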

Example Federated Queries
- Retrieve a list of differentially expressed genes between different brain regions (e.g., hippocampus and entorhinal cortex) for normally aged human subjects.
- Retrieve a list of differentially expressed genes for the same brain region of normal human subjects and AD patients.
- Using these lists of genes, one can issue (federated) queries to retrieve additional information about the genes for various types of analyses (e.g., GO term enrichment).
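The first query plus the follow-up lookup could be phrased roughly as below. This is only a sketch: it assumes the SERVICE keyword from SPARQL 1.1 Federated Query, and the `ex:` vocabulary, the endpoint URLs, and the function name are hypothetical placeholders rather than anything defined by the BioRDF group.

```python
def federated_gene_query(region_a, region_b, expression_endpoint, annotation_endpoint):
    """Build a SPARQL query string that spans two (hypothetical) endpoints."""
    return f"""
PREFIX ex: <http://example.org/microarray#>
SELECT ?gene ?goTerm WHERE {{
  SERVICE <{expression_endpoint}> {{
    ?result ex:gene ?gene ;
            ex:comparesRegion "{region_a}" ;
            ex:comparesRegion "{region_b}" ;
            ex:differentiallyExpressed true .
  }}
  SERVICE <{annotation_endpoint}> {{
    ?gene ex:hasGoTerm ?goTerm .
  }}
}}
""".strip()


query = federated_gene_query(
    "hippocampus",
    "entorhinal cortex",
    "http://example.org/expression/sparql",  # placeholder endpoint
    "http://example.org/annotation/sparql",  # placeholder endpoint
)
print(query)
```

The first SERVICE block stands in for a microarray results store and the second for a gene annotation source; the shared `?gene` variable is what joins them.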

Microarray Experiment Descriptions

E-GEOD-3296: Transcription profiling of primary mouse embryonic fibroblasts (MEFs) from C57Bl/6x129/Sv F2 e14.5 embryos that contain a deletion in the CH1 domain of three of four alleles of CBP and p300. The CH1 protein interaction domain of the transcriptional coactivators p300 and CBP is thought to interact with HIF-1alpha, and this interaction is thought to be critical to the expression of HIF-1alpha target genes in response to hypoxia. Trichostatin A (TSA), an inhibitor of histone deacetylases, has been reported to repress the expression of HIF-1alpha target genes. To test the requirement of the CH1 domain and TSA for gene expression in response to dipyridyl (a hypoxia mimetic), primary mouse embryonic fibroblasts (MEFs) were generated from C57Bl/6x129/Sv F2 e14.5 embryos that contain a deletion in the CH1 domain of three of four alleles of CBP and p300. The remaining allele of p300 or CBP was a conditional knockout allele. Control MEFs with only a single conditional knockout allele of p300 or CBP were also generated. At passage 3, MEFs were infected with Cre Adenovirus and grown until they had expanded at least 100 fold. Subconfluent MEFs were treated with ethanol vehicle or 100 ng/ml TSA with 5% carbon dioxide at 37 C in a humid chamber for 30 min., followed by ethanol vehicle or 100 uM dipyridyl (DP) for an additional 3 hrs. Immediately after treatment, cells were lysed in Trizol for RNA extraction.

E-GEOD-3327: Transcription profiling of different regions of mouse brain to study adult mouse gene expression patterns in common strains. Experiment Overall Design: six mouse strains and seven brain regions were analyzed.

E-GEOD-358: Transcription profiling of rat whole brain samples from animals with repeated exposure to the anaesthetic isoflurane. 12 Controls, 3 5-exposures, 3 10-exposures. Rats were exposed to 90 minutes of 1.0% isoflurane twice a day for a total of 5 or 10 exposures. Animals did not require intubation. All exposures and hybridizations were performed at the Univ. of Pennsylvania.

Open Biomedical Annotator

Some Results
- Two microarray experiments (E-GEOD-4034, E-GEOD-4035) contain the following set of terms: fear, hippocampus, mouse.
- These microarray experiments study the role of the hippocampus in fear, using the mouse as the model organism.
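The kind of lookup behind this result can be sketched as a set-containment test over per-experiment term sets. The term sets below are written by hand for illustration (those for E-GEOD-3327 and E-GEOD-358 are guesses based on the descriptions earlier in the talk), not actual output of the Open Biomedical Annotator, and the function name is hypothetical.

```python
def experiments_with_terms(annotations, required_terms):
    """Return accessions whose annotation set contains every required term."""
    required = set(required_terms)
    return sorted(
        accession
        for accession, terms in annotations.items()
        if required <= set(terms)
    )


# Hand-written term sets standing in for annotator output.
annotations = {
    "E-GEOD-4034": {"fear", "hippocampus", "mouse"},
    "E-GEOD-4035": {"fear", "hippocampus", "mouse"},
    "E-GEOD-3327": {"mouse", "brain"},
    "E-GEOD-358": {"rat", "brain", "isoflurane"},
}

matches = experiments_with_terms(annotations, ["fear", "hippocampus", "mouse"])
print(matches)
```

Because the terms come from ontologies, the same test could be widened with the hierarchy (for example, treating hippocampus as a kind of brain region) rather than requiring exact term matches.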

Analysis Tools
- Bioconductor
- GenePattern
- GeneSpring

Intercommunity Collaboration
- HCLS (BioRDF)
- MGED (ArrayExpress)
- NIF (NeuroLex)
- Ontology community (NCBO)

Web of silos: CEL, GPR, etc.

Semantic Web = Brilliant Web!

The End

Discussion
- What is the RDF structure?
- Extension of SPARQL to empower data analysis
- Workflow and provenance
- Visualization
- How to integrate database and literature
- Integration of other types of data
- Inter-community collaboration
- Translational use cases

What should be the RDF structure?
- Experiments
- Samples
- Experimental conditions/factors
- Gene lists
- Arrays/chips
- Raw/processed data (e.g., CEL, GPR, gene matrix)
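To make the question concrete, the sketch below writes one possible shape for such a structure as N-Triples, linking an experiment to a sample, an experimental factor, an array, and a raw data file. Every IRI, property name, and value other than the accession E-GEOD-3327 is a hypothetical placeholder; no agreed vocabulary is implied.

```python
BASE = "http://example.org/microarray/"  # placeholder namespace


def iri(local_name):
    """Wrap a local name as an IRI in the placeholder namespace."""
    return f"<{BASE}{local_name}>"


def literal(text):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


triples = [
    (iri("experiment/E-GEOD-3327"), iri("vocab#hasSample"), iri("sample/1")),
    (iri("sample/1"), iri("vocab#organism"), literal("mouse")),
    (iri("sample/1"), iri("vocab#factorName"), literal("brain region")),
    (iri("sample/1"), iri("vocab#hybridizedTo"), iri("array/1")),
    (iri("sample/1"), iri("vocab#rawDataFile"), literal("sample1.CEL")),
]


def to_ntriples(rows):
    """Serialize (subject, predicate, object) rows, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


document = to_ntriples(triples)
print(document)
```

Gene lists and processed matrices could hang off the experiment in the same way; whether large numeric matrices belong in RDF at all, or should only be referenced by file, is part of the open question on this slide.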

Extension of SPARQL
- Hierarchical queries
- Statistical analyses/tests
- Enrichment analysis
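As an example of the kind of computation such an extension would need to express, enrichment analysis is commonly done with a one-sided hypergeometric test. The sketch below is a minimal standard-library version (it needs Python 3.8+ for `math.comb`); the function and parameter names are chosen for this example, and the slides do not say which test the group had in mind.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    population: total number of genes considered
    annotated:  genes in the population carrying the term of interest
    selected:   size of the gene list being tested
    overlap:    genes in the list that carry the term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # math.comb returns 0 when k > n, so impossible splits contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 20 genes, 5 annotated, a list of 5, all 5 annotated.
p = enrichment_p_value(population=20, annotated=5, selected=5, overlap=5)
print(p)
```

A small p-value suggests the term is over-represented in the gene list; in practice the test is repeated for many terms, so a multiple-testing correction would normally follow.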

Workflow and provenance
- Taverna
- BioMoby
- GenePattern

Visualization
- Cytoscape
- TreeView

How to integrate database and literature

Inter-community Collaboration
- NCBO
- SWAN

What other types of data can be integrated with microarray data?

Translational use cases