DATA WAREHOUSE FOR BIO-GEO HEALTH CARE INFORMATICS
Ranjit Ganta, Raj Acharya, Shruthi Prabhakara
Department of Computer Science and Engineering, Penn State University


INTRODUCTION

Health-care records: 'bio-geo' informatics
- Patient identification information
- Geographical information
- Clinical information:
  - Organ/cellular level: tumor, pathology
  - Molecular level: DNA sequence, microarray
  - Laboratory data: blood tests, diagnosis, prognosis

Integration of health-care records:
- Privacy violation
- Distributed integration of health-care records

Integration within health-care records:
- Information fusion: combine multiple disparate sources of information such that the whole is more than the sum of its parts.

CORRESPONDENCE ANALYSIS

For the patient demographic data set, correspondence analysis helps answer questions such as:
- Which age/race profile(s), if any, define a typical profile of a prostate cancer patient?
- Are middle-aged Caucasian males more prone to prostate cancer than Caucasians of other age groups?
- Is there a close association between age and race groups?

[Figure: sample result]

EXAMPLE DATA

Dhanasekaran et al., "Delineation of prognostic biomarkers in prostate cancer", Letters to Nature, Vol. 412, August 2001. Supplementary data (Fig. 1C, pg. 823, Commercial Pool).
Gene expression (microarray data) in four clinical states of prostate-derived tissues.

CLINICAL STATES

Benign states:
- BPH: Benign Prostatic Hyperplasia
- NAP: Normal Adjacent Prostate

Malignant states:
- PCA: Localized prostate cancer
- MET: Metastatic sample

[Figure: sample result]

KL-CLUSTERING

Genes to co-regulated genes
Input profiles: g1, g2, g3, g4, g5, g6
Clusters:
  Down-regulated: {g1}
  Up-regulated:   {g2, g4}; {g3}; {g5}
  No change:      {g6}

The Kullback-Leibler (KL) divergence measures the relative dissimilarity of the shapes of two gene profiles.
1-D SOM algorithm + KL: minimize D(Gene || SOM weight for each node) at each iteration step. [Bioinformatics, Vol. 19, No. 4, 2003]

COMMON MOTIFS

- Motif: a short segment of DNA that acts as a binding site for a specific transcription factor
- Typically 6-25 bp in length
- Statistically different in composition compared to the background
- Often repeated within a sequence

[Table: motif vectors. Rows: Gene 1 ... Gene n. Columns: Motif 1, Motif 2, ..., Motif k. Each entry is the frequency of occurrence of the motif in the gene; the values are not recoverable from the transcript.]

COMBINED CLUSTERING

Clustering using more than one data source aims at identifying clusters of genes with similar properties across all data. The goal of combined clustering is to answer the following:
1. If genes have similar expression profile patterns, do they also share common motifs?
2. If genes have a set of motifs in common, do they also exhibit similar expression profile patterns?
3. Which genes share both, that is, have similar expression profile patterns and share a set of common motifs?

[Figure: Alpha factor experiments. Panels: cluster on motif vectors; cluster on gene expression; combined clustering. All genes in the cluster share the transcription factor MCBa.]

CONCLUSION

[Figure: Information fusion based attack]

Prototype for bio-geo data warehouse
Components: Gene Expression; Clinical and Pathology; Public Data (literature etc.); Patient Information; Geographical Information; Global Statistics; Information Fusion based Clustering; Cancer Research Grid; Cancer Analysis Applications; Result Visualization.

The prototype is intended:
- To serve as a data warehouse for the various data sets involved in bio-geo health care informatics studies.
- To provide and demonstrate a set of information fusion tools for disease research.

We have demonstrated the significance of information fusion based tools for bio-geo health care informatics.
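
The two building blocks named in the KL-clustering and common-motifs sections can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the poster does not say how expression profiles are turned into distributions, so the shift-and-scale step in `to_distribution` is a hypothetical choice, and all function names here are invented for the example. `nearest_node` shows only the assignment step D(Gene || SOM weight) for fixed node weights, not the SOM weight update.

```python
import math


def to_distribution(profile, eps=1e-9):
    """Turn an expression profile into a probability distribution.

    Assumption (not from the poster): shift so the minimum is slightly
    above zero, then scale so the values sum to 1.
    """
    lo = min(profile)
    shifted = [x - lo + eps for x in profile]
    total = sum(shifted)
    return [x / total for x in shifted]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def nearest_node(profile, node_weights):
    """Index of the node whose weight vector minimizes D(gene || weight)."""
    p = to_distribution(profile)
    divs = [kl_divergence(p, to_distribution(w)) for w in node_weights]
    return min(range(len(divs)), key=divs.__getitem__)


def motif_counts(sequence, motifs):
    """Frequency-of-occurrence vector: overlapping counts of each motif."""
    counts = []
    for m in motifs:
        n = sum(
            1
            for i in range(len(sequence) - len(m) + 1)
            if sequence[i:i + len(m)] == m
        )
        counts.append(n)
    return counts


if __name__ == "__main__":
    # Hypothetical node weights: one falling profile, one rising profile.
    nodes = [[4, 3, 2, 1], [2, 4, 6, 8]]
    print(nearest_node([1, 2, 3, 4], nodes))
    print(motif_counts("ACGTACGT", ["ACG", "TT"]))
```

Because the profiles are normalized before comparison, a rising profile such as [1, 2, 3, 4] matches a rising node weight regardless of absolute scale, which is the sense in which KL divergence compares the shapes of profiles. The motif count vectors correspond to the rows of the gene-by-motif table used for combined clustering.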