Protein-Cytokine network reconstruction using information theory-based analysis. Farzaneh Farhangmehr, UCSD. Presentation #3, July 25, 2011.

What is Information Theory? Information is any event that affects the state of a dynamic system. Information theory deals with the measurement and transmission of information through a channel, and it answers two fundamental questions: What is the ultimate reliable transmission rate of information? (the channel capacity C) What is the ultimate data compression? (the entropy H)

Key elements of information theory Entropy H(X): a measure of the uncertainty associated with a random variable; it quantifies the expected value of the information contained in a message (Shannon, 1948). Capacity C: the tightest upper bound on the amount of information that can be reliably transmitted over a channel. If the entropy of the source is less than the capacity of the channel, asymptotically error-free communication can be achieved.
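For concreteness, the entropy defined above can be computed directly from a probability mass function; a minimal sketch in Python (not from the slides):

```python
import math

def entropy(pmf):
    """Shannon entropy in bits: H(X) = -sum_x p(x) * log2 p(x)."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

print(entropy([0.5, 0.5]))      # fair coin -> 1.0 bit
print(entropy([0.25] * 4))      # uniform over 4 outcomes -> 2.0 bits
```

A deterministic outcome (`entropy([1.0])`) gives 0 bits: no uncertainty, no information.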

Key elements of information theory Joint entropy: the joint entropy H(X,Y) of a pair of discrete random variables (X, Y) with a joint distribution p(x, y) is H(X,Y) = -Σ_x Σ_y p(x,y) log p(x,y). Conditional entropy: quantifies the remaining entropy (i.e. uncertainty) of a random variable Y given that the value of another random variable X is known: H(Y|X) = -Σ_x Σ_y p(x,y) log p(y|x).
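The two definitions above can be checked numerically against the chain rule H(X,Y) = H(X) + H(Y|X); a small sketch with a hypothetical 2x2 joint pmf:

```python
import math

# Hypothetical joint pmf p(x, y) for two binary variables.
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

def H(probs):
    """Entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Marginal p(x) by summing the joint over y.
p_x = {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p

# Conditional entropy H(Y|X) = -sum p(x,y) log2 p(y|x).
H_y_given_x = -sum(p * math.log2(p / p_x[x]) for (x, y), p in p_xy.items())

joint = H(p_xy.values())
marginal = H(p_x.values())
print(abs(joint - (marginal + H_y_given_x)) < 1e-9)  # chain rule holds -> True
```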

Key elements of information theory Mutual Information I(X;Y): the reduction in the uncertainty of X due to the knowledge of Y. I(X;Y) = H(X) + H(Y) - H(X,Y) = H(Y) - H(Y|X) = H(X) - H(X|Y)
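Two of the identities above can be verified on a small hypothetical joint distribution:

```python
import math

# Hypothetical joint pmf of two binary variables.
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Marginals p(x) and p(y).
p_x = [sum(p for (x, _), p in p_xy.items() if x == v) for v in (0, 1)]
p_y = [sum(p for (_, y), p in p_xy.items() if y == v) for v in (0, 1)]

mi1 = H(p_x) + H(p_y) - H(p_xy.values())                       # H(X)+H(Y)-H(X,Y)
h_y_given_x = -sum(p * math.log2(p / p_x[x]) for (x, y), p in p_xy.items())
mi2 = H(p_y) - h_y_given_x                                     # H(Y)-H(Y|X)

print(round(mi1, 4), round(mi2, 4))  # the two forms agree
```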

Basic principles of the information-theoretic model of network reconstruction The framework of network reconstruction using information theory has two stages: 1- computation of the mutual information coefficients; 2- determination of the threshold. Mutual information networks rely on the measurement of the mutual information matrix (MIM), a square matrix whose elements MIM_ij = I(X_i; Y_j) are the mutual information between X_i and Y_j. Choosing a proper threshold is a non-trivial problem. The usual way is to permute the expression measurements many times and recalculate the distribution of mutual information for each permutation. The distributions are then averaged, and a good choice for the threshold is the largest mutual information value in the averaged permuted distribution. Examples of such methods include ARACNE, CLR, and MRNET.
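A sketch of the permutation idea described above, on hypothetical data, with MI estimated from a simple 2D histogram (a simplified stand-in for the methods named on this slide):

```python
import math
import random
from collections import Counter

def mi_binned(xs, ys, bins=4):
    """Estimate I(X;Y) in bits from paired samples via equal-width binning."""
    def to_bins(vs):
        lo, hi = min(vs), max(vs)
        return [min(bins - 1, int((v - lo) / (hi - lo + 1e-12) * bins)) for v in vs]
    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(c / n * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

random.seed(0)
x = [random.gauss(0, 1) for _ in range(500)]
y = [v + random.gauss(0, 0.5) for v in x]   # y genuinely depends on x

# Permutation null: shuffle y to destroy the pairing, keep the largest MI seen.
null_mis = []
for _ in range(100):
    perm = y[:]
    random.shuffle(perm)
    null_mis.append(mi_binned(x, perm))
threshold = max(null_mis)

print(mi_binned(x, y) > threshold)  # a truly dependent pair should exceed the null
```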

Advantages of the information-theoretic model over other available methods for network reconstruction Mutual information makes no assumption about the functional form of the statistical distribution, so it is a non-parametric method. It does not require any decomposition of the data into modes, and there is no need to assume additivity of the original variables. Since it does not need any binning to generate the histograms, it consumes fewer computational resources.

Information-theoretic model of networks X = {x_1, ..., x_i}, Y = {y_1, ..., y_j}. We want to find the best model that maps X to Y. The general definition: Y = f(X) + U. In linear cases: Y = [A]X + U, where [A] is a matrix that defines the linear dependency between inputs and outputs. Information theory provides both models (linear and non-linear) and maps inputs to outputs using the mutual information function: I(X;Y) = Σ_x Σ_y p(x,y) log [ p(x,y) / (p(x) p(y)) ]

Key elements of information theory-based networks interface

Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE) ARACNE is an information-theoretic algorithm for reconstructing networks from microarray data. ARACNE follows these steps: - It assigns to each pair of nodes a weight equal to their mutual information. - It then scans all nodes and removes the weakest edges; a threshold value is used to eliminate them. - At this point, it calculates the mutual information of the system with kernel density estimators and assigns a p value, P (the joint probability of the system), to find a new threshold. - The above steps are repeated until a reliable threshold for the assigned P is obtained.
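A toy sketch of the weakest-edge-removal step, here phrased as the data-processing-inequality pruning ARACNE is known for, applied to a hypothetical MI matrix (not the authors' implementation):

```python
# Hypothetical symmetric MI values stored as a dict of edges.
mi = {
    ("A", "B"): 0.9,
    ("B", "C"): 0.8,
    ("A", "C"): 0.3,   # likely an indirect interaction through B
}

def get(mi, u, v):
    """Look up an undirected edge weight."""
    return mi.get((u, v), mi.get((v, u), 0.0))

def dpi_prune(mi, nodes):
    """For every node triplet, mark the weakest of its three edges for removal."""
    doomed = set()
    ns = sorted(nodes)
    for i in range(len(ns)):
        for j in range(i + 1, len(ns)):
            for k in range(j + 1, len(ns)):
                a, b, c = ns[i], ns[j], ns[k]
                edges = [((a, b), get(mi, a, b)),
                         ((a, c), get(mi, a, c)),
                         ((b, c), get(mi, b, c))]
                doomed.add(min(edges, key=lambda e: e[1])[0])
    return {e: w for e, w in mi.items()
            if e not in doomed and (e[1], e[0]) not in doomed}

print(sorted(dpi_prune(mi, {"A", "B", "C"})))  # -> [('A', 'B'), ('B', 'C')]
```

The indirect edge A-C is dropped because it is the weakest in the only triplet, leaving the direct interactions.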

Protein-Cytokine network: Histograms and probability mass functions 22 signaling proteins responsible for cytokine release: cAMP, AKT, ERK1, ERK2, Ezr/Rdx, GSK3A, GSK3B, JNK lg, JNK sh, MSN, p38, p40Phox, NFkB p65, PKCd, PKCmu2, RSK, Rps6, SMAD2, STAT1a, STAT1b, STAT3, STAT5. 7 released cytokines (as signal receivers): G-CSF, IL-1a, IL-6, IL-10, MIP-1a, RANTES, TNFa. Using the information-theoretic model, we want to reconstruct this network from the microarray data and determine which proteins are responsible for each cytokine release.

Protein-Cytokine network: Histograms and probability mass functions First step: finding the probability mass distributions of the cytokines and proteins. Using information theory, we want to identify the signaling proteins responsible for cytokine release, so we reconstruct the network using information-theoretic techniques. The two pictures on the left show the histograms and probability mass functions of the cytokines and proteins.
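The first step, turning raw measurements into probability mass functions, reduces to counting; a toy sketch with hypothetical discretized levels for one protein:

```python
from collections import Counter

# Hypothetical discretized measurement levels for one signaling protein.
measurements = ["low", "low", "mid", "high", "mid", "low", "mid", "mid"]

counts = Counter(measurements)
n = len(measurements)
pmf = {level: c / n for level, c in counts.items()}

print(pmf["mid"])  # 4 of 8 samples -> 0.5
```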

Protein-Cytokine network: The joint probability mass functions

Protein-Cytokine network: Mutual information for each of the 22x7 connections Third step: compute the mutual information for each of the 22x7 connections by calculating the marginal and joint entropies.
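Filling the 22x7 mutual-information matrix is one estimate per (protein, cytokine) pair; a toy sketch with hypothetical discretized profiles (2x2 instead of 22x7):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

# Hypothetical discretized expression profiles (names only for illustration).
proteins = {"AKT": [0, 0, 1, 1], "ERK1": [0, 1, 0, 1]}
cytokines = {"IL-6": [0, 0, 1, 1], "TNFa": [1, 0, 1, 0]}

# The mutual information matrix: one entry per (protein, cytokine) pair.
mim = {(p, c): mutual_information(pv, cv)
       for p, pv in proteins.items() for c, cv in cytokines.items()}

print(round(mim[("AKT", "IL-6")], 3))  # perfectly matched binary profiles -> 1.0 bit
```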

Protein-Cytokine network: Finding the proper threshold Step 4: the ARACNE algorithm finds the proper threshold using the mutual information from step 3. Using a sample size of 10,000 and a kernel width of 0.15, the algorithm is run for the assigned p values of the joint probability of the system and returns a threshold at each step. The thresholds produced by the algorithm become stable after several iterations, which means the mutual information of the system has become reliable. This threshold (0.7512) is used to discard the weak connections; the remaining connections are used to reconstruct the network.

Protein-Cytokine network: Network reconstruction by keeping the connections above the threshold Step 5: after finding the threshold, all connections above the threshold are used to find the topology of each node. Scanning all nodes (as receiver or source) yields the network. The picture on the left shows the protein-cytokine network reconstructed from the microarray data using the information-theoretic model.

Questions?