Exploring gene pathway interactions using SOM Keala Chan SoCalBSI August 20, 2004.

Presentation transcript:

Microarray data analysis
Idea: Study relationships between functional terms or pathways
Gene expression data
Annotate and partition genes using functional terms

Interacting Gene Pathways
Hypothesis: Some relationship exists between Pathway 1 and Pathway 4

Network of pathways
[Figure: network diagram of Pathways 1, 2, 3, 4, 12, 18, and 35]

Why use a Self-Organizing Map (SOM)?
Serves as a data structure to represent the network
Maps the network onto a 2-D grid, preserving the topological relationship between input vectors
[Figure: the pathway network (Pathways 1, 2, 3, 4, 12, 18, 35) mapped onto SOM grid cells: Pathway 12; Pathway 18; Pathway 1, Pathway 2; Pathway 4; Pathway 3; Pathway 35]

What is SOM?
Tool for mapping similar input patterns onto contiguous locations in the output space
The SOM has two major effects:
1. Clustering, or the creation of abstractions of the input space
2. Visualization of high-dimensional data in a two-dimensional display

Example
Each circle represents a number of input vectors. Hence, the input vectors have been clustered, or abstracted. Also, the topology has been preserved: neighboring representative vectors are similar.
Recall: SOM maps similar input patterns onto contiguous locations in the output space, resulting in clustering and 2-D visualization of the input space.

Representative vectors
[Figure: input vectors (x) scattered around a 2-D representative vector]
The best-matching (closest) representative vector and its neighbors are pulled towards the highlighted input vector.
The representative vector comes to represent this group of similar input vectors.
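The pull towards an input vector can be sketched as a single training step. This is a minimal illustration, not the SOM Toolbox implementation: the rectangular grid (the talk's maps use hexagons), the Gaussian neighborhood, the fixed learning rate, and all names and toy values are assumptions.

```python
import math

def best_matching_unit(codebook, x):
    """Return the grid position whose representative vector is closest to x."""
    return min(codebook, key=lambda pos: math.dist(codebook[pos], x))

def som_update(codebook, x, learning_rate=0.5, radius=1.0):
    """Pull the best-matching unit and its grid neighbors towards input x."""
    bmu = best_matching_unit(codebook, x)
    for pos, w in codebook.items():
        grid_dist = math.dist(pos, bmu)
        # Gaussian neighborhood (assumed): influence decays with grid distance
        h = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
        codebook[pos] = [wi + learning_rate * h * (xi - wi)
                         for wi, xi in zip(w, x)]
    return bmu

# Toy 2x2 rectangular grid of 2-D representative vectors (hypothetical values)
codebook = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0],
            (1, 0): [1.0, 0.0], (1, 1): [1.0, 1.0]}
x = [0.9, 0.8]
bmu = som_update(codebook, x)
```

Repeating this step over many input vectors, usually while shrinking the learning rate and radius, is what lets each representative vector settle on a group of similar inputs.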

Method
Partition genes into GO terms
Apply GSEA
Affymetrix data
Recall: The general goal is to train a SOM on a large dataset to form a network of pathways for further study.
Data: Healthy human tissue from 31 adult sources (brain, kidney, skin, etc.), 108 replicates
Baseline: average

Method (continued)
GSEA scores
Train SOM on the pathway dataset
GSEA scores normalized so mean=0 and stdev=1
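The normalization to mean 0 and standard deviation 1 can be sketched as follows. The slide does not say along which axis the scores were scaled or which standard deviation was used, so the per-column scaling, the population standard deviation, and the toy values here are all assumptions.

```python
from statistics import fmean, pstdev

def zscore_columns(rows):
    """Scale each column of a row-major table to mean 0 and stdev 1."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]  # population stdev (assumed)
    return [[(value - m) / s for value, m, s in zip(row, means, stdevs)]
            for row in rows]

# Hypothetical score table: rows are pathways, columns are conditions
scores = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
normalized = zscore_columns(scores)
```

A constant column has a standard deviation of zero and would need special handling before the division.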

Visualizing first results
These terms all map to, or are represented by, the same hexagon:
Biological_Process_glycolysis_(10)
Molecular_Function_3-oxo-5-alpha-steroid_4-dehydrogenase_(4)
Molecular_Function_ATP-binding_cassette_(ABC)_transporter_(65)
Molecular_Function_blood_coagulation_factor_IX_(3)
Molecular_Function_blood_coagulation_factor_VII_(4)
Molecular_Function_blood_coagulation_factor_X_(3)
Molecular_Function_fructose-bisphosphate_aldolase_(9)
Molecular_Function_interleukin_receptor_(6)
Molecular_Function_pyruvate_kinase_(3)
Molecular_Function_sodium:phosphate_symporter_(5)
Molecular_Function_transaminase_(24)
These pathways are most activated in the liver.

K-means clustering
k-means (k = 15) clustering of the representative vectors groups pathways that are often activated at the same time.
Next: Examine which k-means clusters are activated under each condition.
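Clustering the representative vectors with k-means can be sketched with a plain Lloyd's-algorithm loop. This is an illustration under assumptions, not the code used in the talk: the evenly spaced initial centroids are a simplification chosen for reproducibility (real implementations usually initialize randomly), and the toy data uses k = 2 rather than 15.

```python
import math

def kmeans(vectors, k, iterations=20):
    """Lloyd's algorithm with evenly spaced initial centroids (a simplification)."""
    step = len(vectors) // k
    centroids = [list(vectors[i * step]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(v, centroids[c]))
                  for v in vectors]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical representative vectors forming two well-separated groups
codebook_vectors = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
labels, centroids = kmeans(codebook_vectors, k=2)
```

Because the clustering runs on the representative vectors rather than on the raw pathways, every pathway inherits the cluster of the hexagon it maps to.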

Projecting a new dataset
To test for pathways that interact consistently, I projected GSEA scores for 16 different brain tumor types onto the SOM.
Biological_Process_glycolysis_(10)
Molecular_Function_3-oxo-5-alpha-steroid_4-dehydrogenase_(4)
Molecular_Function_ATP-binding_cassette_(ABC)_transporter_(65)
Molecular_Function_blood_coagulation_factor_IX_(3)
Molecular_Function_blood_coagulation_factor_VII_(4)
Molecular_Function_blood_coagulation_factor_X_(3)
Molecular_Function_fructose-bisphosphate_aldolase_(9)
Molecular_Function_interleukin_receptor_(6)
Molecular_Function_pyruvate_kinase_(3)
Molecular_Function_sodium:phosphate_symporter_(5)
Molecular_Function_transaminase_(24)
Mapped pathways and GSEA scores to the same location in the SOM
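The slide does not spell out the projection procedure. One plausible reading, offered here only as an assumption, is that each pathway keeps the hexagon it was assigned during training on healthy tissue and the new GSEA scores are then summarized per hexagon. The sketch below averages the scores of pathways sharing a unit; the function name, pathway names, grid positions, and scores are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean

def project_scores(unit_of_pathway, new_scores):
    """Average new per-pathway scores within each SOM unit.

    unit_of_pathway: pathway name -> grid position learned on training data
    new_scores: pathway name -> score in the new condition
    Pathways without a known unit are skipped.
    """
    by_unit = defaultdict(list)
    for pathway, score in new_scores.items():
        if pathway in unit_of_pathway:
            by_unit[unit_of_pathway[pathway]].append(score)
    return {unit: fmean(scores) for unit, scores in by_unit.items()}

# Hypothetical assignments and scores, not values from the talk
unit_of_pathway = {"pathway_a": (2, 3), "pathway_b": (2, 3), "pathway_c": (0, 1)}
tumor_scores = {"pathway_a": 1.0, "pathway_b": 2.0, "pathway_c": -0.5}
projected = project_scores(unit_of_pathway, tumor_scores)
```

Running this once per tumor type would give one colored map per condition, which can then be compared against the healthy-tissue map.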

Brain tumor data
Questions to ask:
What is the best we can do with respect to the visual smoothness of the projection?
What characterizes a “good” projection?
Next:
Plot histogram of distances between any two pathways mapping to the same hexagon.
Calculate activation scores for k-means clusters trained on healthy data.
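The planned histogram of within-hexagon distances could be computed along these lines. The talk lists the distance metric as an open parameter, so Euclidean distance is an assumption here, as are the function names, the bin width, and the toy data.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def within_unit_distances(unit_of_pathway, vectors):
    """Euclidean distances between every pair of pathways sharing a SOM unit."""
    members = defaultdict(list)
    for pathway, unit in unit_of_pathway.items():
        members[unit].append(vectors[pathway])
    return [math.dist(a, b)
            for group in members.values()
            for a, b in combinations(group, 2)]

def histogram(values, bin_width):
    """Count values per bin; bin i covers [i * bin_width, (i + 1) * bin_width)."""
    return Counter(int(v // bin_width) for v in values)

# Hypothetical unit assignments and score vectors
unit_of_pathway = {"a": (0, 0), "b": (0, 0), "c": (0, 0), "d": (1, 2)}
vectors = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [0.0, 1.0], "d": [9.0, 9.0]}
distances = within_unit_distances(unit_of_pathway, vectors)
counts = histogram(distances, bin_width=2.0)
```

A histogram concentrated near zero would suggest that pathways sharing a hexagon remain similar in the projected data.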

Fetal tissue

Next?
Validation by biologists
Choose parameters wisely (projection data, normalization, distance metric)
Study k-means clustering of SOM
More projections on SOM

Acknowledgments
SOM Toolbox
All BioDiscovery software
Stan Nelson Lab microarray data
Michael Sneddon
Dr. Bruce Hoff
Dr. Soheil Shams
Everyone at SoCalBSI