Networks and Algorithms in Bio-informatics D. Frank Hsu Fordham University *Joint work with Stuart Brown; NYU Medical School Hong Fang.


Networks and Algorithms in Bio-informatics
D. Frank Hsu, Fordham University
*Joint work with Stuart Brown, NYU Medical School; Hong Fang Liu, Columbia School of Medicine; and students at Fordham, Columbia, and NYU

Outline
(1) Networks in Bioinformatics
(2) Micro-array Technology
(3) Data Analysis and Data Mining
(4) Rank Correlation and Data Fusion
(5) Remarks and Further Research

(1) Networks in Bioinformatics
(A) Real Networks: gene regulatory networks, metabolic networks, protein-interaction networks.
(B) Virtual Networks: network of interacting organisms, relationship networks.
(C) Abstract Networks: Cayley networks, etc.

(1) Networks in Bioinformatics, (A) & (B)
DNA, RNA, Protein
Biosphere - Network of interacting organisms
Organism - Network of interacting cells
Cell - Network of interacting molecules
Molecule - Genome, transcriptome, proteome

The DBRF Method for Inferring a Gene Network
S. Onami, K. Kyoda, M. Morohashi, H. Kitano
In “Foundations of Systems Biology,” 2002
Presented by Wesley Chuang

Positive vs. Negative Circuit

Difference Based Regulation Finding Method (DBRF)

Inference Rule of Genetic Interaction
Gene a activates (represses) gene b if the expression of b goes down (up) when a is deleted.
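The inference rule can be sketched in a few lines of Python. This is a minimal illustration, not the DBRF implementation: the function name `infer_interactions`, the `threshold` parameter, and the toy expression values are all assumptions made for the example.

```python
def infer_interactions(wild_type, deletion_profiles, threshold=0.0):
    """Apply the difference-based rule to single-gene deletion data.

    wild_type: {gene: expression level} in the undisturbed network.
    deletion_profiles: {deleted gene a: {gene b: level of b when a is deleted}}.
    threshold: hypothetical noise margin; differences inside it are ignored.
    Returns {(a, b): "+"} for activation and {(a, b): "-"} for repression.
    """
    edges = {}
    for a, profile in deletion_profiles.items():
        for b, level in profile.items():
            if b == a:
                continue  # the deleted gene itself carries no information
            diff = level - wild_type[b]
            if diff < -threshold:
                edges[(a, b)] = "+"  # b goes down when a is deleted: a activates b
            elif diff > threshold:
                edges[(a, b)] = "-"  # b goes up when a is deleted: a represses b
    return edges


# Toy data (made up for illustration).
wild_type = {"a": 1.0, "b": 0.8, "c": 0.5}
deletion_profiles = {
    "a": {"a": 0.0, "b": 0.2, "c": 0.5},
    "b": {"a": 1.0, "b": 0.0, "c": 0.9},
}
edges = infer_interactions(wild_type, deletion_profiles, threshold=0.1)
print(edges)
```

With this toy data, deleting a lowers b (so a activates b) and deleting b raises c (so b represses c).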

Parsimonious Network
The route that consists of the largest number of genes is the parsimonious route; the others are redundant.
The regulatory effect depends only on the parity of the number of negative regulations involved in the route.
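A minimal sketch of the parity rule and of pruning redundant routes, under stated assumptions: edges carry a sign of +1 (activation) or -1 (repression), a direct edge is treated as redundant when a longer simple route between the same two genes has the same net effect, and redundancy is checked against the full inferred network. The helper names are hypothetical, and the exhaustive route search is only practical for small networks.

```python
def route_sign(route, sign):
    """Net effect of a route: +1 for an even number of negative edges, else -1."""
    effect = 1
    for u, v in zip(route, route[1:]):
        effect *= sign[(u, v)]
    return effect


def simple_routes(sign, src, dst, path=None):
    """Yield every route from src to dst that visits no gene twice."""
    path = path or [src]
    for (u, v) in sign:
        if u == path[-1] and v not in path:
            if v == dst:
                yield path + [v]
            else:
                yield from simple_routes(sign, src, dst, path + [v])


def parsimonious(sign):
    """Drop each direct edge that a longer route with the same net effect explains."""
    kept = {}
    for (a, b), s in sign.items():
        redundant = any(
            len(route) > 2 and route_sign(route, sign) == s
            for route in simple_routes(sign, a, b)
        )
        if not redundant:
            kept[(a, b)] = s
    return kept


# Toy network (made up): a -| c is explained by a -> b -| c, but a -> d is not
# explained by a -> b -| d because the parities differ.
sign = {
    ("a", "b"): 1, ("b", "c"): -1, ("a", "c"): -1,
    ("a", "d"): 1, ("b", "d"): -1,
}
pruned = parsimonious(sign)
print(sorted(pruned))
```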

Algorithm for Parsimonious Network

A Gene Regulatory Network Model
W: connection weight
h_a: effect of general transcription factor
λ_a: degradation (proteolysis) rate
v_a: expression level of gene a
R_a: max rate of synthesis
g(u): a sigmoidal function
node: gene; edge: regulation
Parameters were randomly determined.
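The slide lists the symbols but the transcript does not preserve the equation itself. The sketch below assumes the commonly used form dv_a/dt = R_a * g(sum_b W_ab * v_b + h_a) - λ_a * v_a, which is consistent with the symbols above, and assumes the logistic function for g. The Euler integrator, the step size, and the two-gene example are illustrative choices only.

```python
import math


def g(u):
    """Logistic sigmoid (an assumed choice for the sigmoidal function g)."""
    return 1.0 / (1.0 + math.exp(-u))


def simulate(W, h, lam, R, v0, dt=0.01, steps=5000, deleted=None):
    """Integrate the assumed model with forward Euler steps.

    W[a][b] is the weight of the regulation of gene a by gene b.
    deleted, if given, is the index of a gene held at zero expression.
    Returns the expression levels after the final step.
    """
    n = len(v0)
    v = list(v0)
    if deleted is not None:
        v[deleted] = 0.0
    for _ in range(steps):
        dv = []
        for a in range(n):
            u = sum(W[a][b] * v[b] for b in range(n)) + h[a]
            dv.append(R[a] * g(u) - lam[a] * v[a])
        for a in range(n):
            v[a] += dt * dv[a]
        if deleted is not None:
            v[deleted] = 0.0  # a deleted gene is never expressed
    return v


# Two genes: gene 0 activates gene 1 (weights made up for illustration).
W = [[0.0, 0.0], [2.0, 0.0]]
h, lam, R = [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]
wild_type = simulate(W, h, lam, R, [0.0, 0.0])
knockout = simulate(W, h, lam, R, [0.0, 0.0], deleted=0)
print(wild_type, knockout)
```

In this example the level of gene 1 is lower in the knockout than in the wild type, which is the signature the deletion-based inference rule reads as activation.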

Experiment Results
Sensitivity: the percentage of edges in the target network that are also present in the inferred network.
Specificity: the percentage of edges in the inferred network that are also present in the target network.
N: number of genes; K: max indegree
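Both measures reduce to set operations on edge lists. The sketch below assumes edges are compared as (regulator, target) pairs without regard to sign; the function names and the toy networks are made up. Note that specificity as defined here is what is often called precision elsewhere.

```python
def sensitivity(target, inferred):
    """Percentage of target-network edges that appear in the inferred network."""
    return 100.0 * len(target & inferred) / len(target)


def specificity(target, inferred):
    """Percentage of inferred-network edges that appear in the target network."""
    return 100.0 * len(target & inferred) / len(inferred)


# Toy networks as sets of (regulator, target) edges.
target = {("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")}
inferred = {("a", "b"), ("b", "c"), ("b", "d")}

sens = sensitivity(target, inferred)  # 2 of 4 target edges recovered
spec = specificity(target, inferred)  # 2 of 3 inferred edges correct
print(sens, spec)
```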

Continuous vs. Binary Data

DBRF vs. Predictor Method

Inferred (Yeast) Gene Network

Known vs. Inferred Gene Network

Conclusion Applicable to continuous values of expressions. Scalable for large-scale gene expression data. DBRF is a powerful tool for genome-wide gene network analysis.

(3) Data Analysis and Data Mining
cDNA microarrays and high-density oligonucleotide chips
Gene expression levels
Classification of tumors, diseases, and disorders (already known or yet to be discovered)
Drug design and discovery, treatment of cancer, etc.

(3) Data Analysis and Data Mining
Data matrix: rows g_1, g_2, g_3, ..., g_p; columns c_1, t_1, c_2, t_2, c_3, t_3, ..., c_n, t_n.

Tumor classification - three methods
(a) Identification of new/unknown tumor classes using gene expression profiles (cluster analysis/unsupervised learning).
(b) Classification of malignancies into known classes (discriminant analysis/supervised learning).
(c) Identification of “marker” genes that characterize the different tumor classes (variable selection).

(3) Data Analysis and Data Mining
Cancer classification and identification
(a) HC – hierarchical clustering methods,
(b) SOM – self-organizing map,
(c) SVM – support vector machines.

(3) Data Analysis and Data Mining
Prediction methods (discrimination methods)
(a) FLDA – Fisher's linear discriminant analysis,
(b) ML – maximum likelihood discriminant rule,
(c) NN – nearest neighbor,
(d) Classification trees,
(e) Aggregating classifiers.
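Of the prediction methods listed, nearest neighbor is the simplest to illustrate. The sketch below is a generic k-nearest-neighbor classifier, not the specific variant used in any study cited in the talk: Euclidean distance, the function name `knn_predict`, the class labels, and the two-gene expression profiles are all assumptions for the example.

```python
import math
from collections import Counter


def knn_predict(train, labels, sample, k=1):
    """Classify a sample by majority vote among its k nearest training profiles."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], sample))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy expression profiles over two genes (values made up for illustration).
train = [[2.0, 0.1], [1.8, 0.3], [0.2, 1.9], [0.1, 2.2]]
labels = ["class1", "class1", "class2", "class2"]

predicted = knn_predict(train, labels, [1.9, 0.2], k=3)
print(predicted)
```

Here the two closest profiles both belong to class1, so the vote with k=3 assigns the new sample to class1.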

Rank Correlation and Data Fusion
Problem 1: For what A and B is P(C) (or P(D)) > max{P(A), P(B)}?
Problem 2: For what A and B is P(C) > P(D)?

(a) Ranked list A: columns x, r_A(x), s_A(x)
(b) Ranked list B: columns x, r_B(x), s_B(x)

(c) Combination of A and B by rank: columns x, f_AB(x), s_f(x), r_C(x)
(d) Combination of A and B by score: columns x, g_AB(x), s_g(x), r_D(x)
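The two combinations can be sketched as follows. The transcript does not preserve how f_AB and g_AB are computed, so this example assumes the simple averages f_AB(x) = (r_A(x) + r_B(x)) / 2 and g_AB(x) = (s_A(x) + s_B(x)) / 2; ties are broken by item name only to keep the output deterministic. The function names and scores are made up, and in practice the scores of A and B would usually be normalized to a common scale first.

```python
def ranks_from_scores(scores):
    """Rank items 1..n by descending score (ties broken by item name)."""
    ordered = sorted(scores, key=lambda x: (-scores[x], x))
    return {x: i + 1 for i, x in enumerate(ordered)}


def combine_by_rank(s_a, s_b):
    """Ranked list C: sort by the average rank f_AB, smallest first."""
    r_a, r_b = ranks_from_scores(s_a), ranks_from_scores(s_b)
    f_ab = {x: (r_a[x] + r_b[x]) / 2 for x in s_a}
    ordered = sorted(f_ab, key=lambda x: (f_ab[x], x))
    return {x: i + 1 for i, x in enumerate(ordered)}


def combine_by_score(s_a, s_b):
    """Ranked list D: sort by the average score g_AB, largest first."""
    g_ab = {x: (s_a[x] + s_b[x]) / 2 for x in s_a}
    return ranks_from_scores(g_ab)


# Toy score functions for systems A and B over the same four items.
s_a = {"d1": 10, "d2": 9, "d3": 8, "d4": 1}
s_b = {"d1": 2, "d2": 4, "d3": 3, "d4": 9}

r_c = combine_by_rank(s_a, s_b)
r_d = combine_by_score(s_a, s_b)
print(r_c, r_d)
```

On this toy data the two combinations agree on the top two items but order d3 and d4 differently, which is the kind of divergence Problems 1 and 2 ask about.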

Theorem 3: Let A, B, C and D be defined as before. Let s_A = L and s_B = L_1 ∪ L_2, where L_1 and L_2 meet at (x*, y*), be defined as above. Let r_A = e_A be the identity permutation. If r_B = t ∘ e_A, where t is the transposition (i, j) with i < j, and q < x*, then P(C) ≥ P(D).

(S_4, S) where S = {(1,2), (2,3), (3,4)}

(S_4, T) where T = {(i,j) | i ≠ j}
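Reading these two pairs as Cayley graphs of the symmetric group S_4 (an interpretation suggested by the "Cayley networks" entry earlier in the talk), the sketch below builds both graphs: nodes are the 24 permutations of {1, 2, 3, 4}, and two permutations are adjacent when one is obtained from the other by a generating transposition. Applying a transposition as a swap of positions is an assumed convention, and the function name is hypothetical.

```python
from itertools import permutations


def cayley_graph(n, transpositions):
    """Undirected Cayley graph of S_n generated by the given transpositions.

    Each transposition (i, j) swaps the entries at positions i and j (1-based).
    Returns the set of edges, each a frozenset of two permutations.
    """
    edges = set()
    for p in permutations(range(1, n + 1)):
        for i, j in transpositions:
            q = list(p)
            q[i - 1], q[j - 1] = q[j - 1], q[i - 1]
            edges.add(frozenset((p, tuple(q))))
    return edges


S = [(1, 2), (2, 3), (3, 4)]                                   # adjacent transpositions
T = [(i, j) for i in range(1, 5) for j in range(i + 1, 5)]     # all transpositions

graph_s = cayley_graph(4, S)
graph_t = cayley_graph(4, T)
print(len(graph_s), len(graph_t))
```

Every node has degree 3 in (S_4, S) and degree 6 in (S_4, T), so the graphs have 24 * 3 / 2 = 36 and 24 * 6 / 2 = 72 edges respectively.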
