
Information Theory of Complex Networks: on evolution and architectural constraints
Ricard V. Solè and Sergi Valverde
Prepared by Amaç Herdağdelen

Introduction
Complex systems as complex networks of interactions: metabolic networks, software class diagrams, electronic circuits.
Describing complex networks by quantitative measures:
Degree distribution (exponential, power law, "normal")
Statistical properties (average degree, clustering, diameter)

Problem
The space of possible networks is far larger and richer than such summaries suggest.
Average statistics fail to capture all essential features and provide little insight.
We need additional measures to analyze and classify complex networks.

Possible Measures
Heterogeneity: how heterogeneous the nodes are (based on degree)
Randomness: is there an underlying order?
Modularity: is there a hierarchical organization?

Zoo of Complex Networks

Notation
G = (V, E): classical graph representation
k(i): degree of node i
P(k): degree distribution (as a probability distribution, summing to 1)
q(k): "remaining degree" distribution; choose a random edge and follow it to one of its endpoints: q(k) is the probability that this node has k further edges, i.e. degree k + 1
<k>: average degree

Notation
[Table: worked example of P(k) and q(k) for a small graph; P values of 8/9 ≈ 0.89 and 1/9 ≈ 0.11 yield q values of 8/16 = 0.5]
q(k) = [(k + 1) * P(k + 1)] / <k>
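To make the formula concrete, here is a minimal Python sketch (our illustration, not part of the original slides) that computes P(k) and q(k) from an undirected edge list. The star-graph example and the helper name degree_distributions are our own assumptions, chosen because a star with one hub and eight leaves reproduces the 8/9, 1/9 and 0.5 values in the table above.

    from collections import Counter

    def degree_distributions(edges):
        # node degrees from an undirected edge list
        deg = Counter()
        for a, b in edges:
            deg[a] += 1
            deg[b] += 1
        n = len(deg)
        counts = Counter(deg.values())
        max_k = max(counts)
        P = [counts.get(k, 0) / n for k in range(max_k + 1)]      # P(k), sums to 1
        mean_k = sum(k * P[k] for k in range(max_k + 1))          # <k>
        q = [(k + 1) * P[k + 1] / mean_k for k in range(max_k)]   # q(k), sums to 1
        return P, q, mean_k

    # Star graph: one hub connected to 8 leaves
    star = [(0, i) for i in range(1, 9)]
    P, q, mean_k = degree_distributions(star)
    print(P[1], P[8])   # 8/9 ~ 0.89 and 1/9 ~ 0.11
    print(q[0], q[7])   # both 8/16 = 0.5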

Degree vs. Remaining Degree
[Figure: classical degree distribution vs. remaining degree distribution for a random graph with evenly distributed degrees]

An Example Measure
Assortative Mixing (AM): high-degree nodes tend to link to other high-degree nodes; found in social networks.
Disassortative Mixing (DM): the reverse, high-degree nodes tend to link to low-degree nodes; found in biological networks.

An Example Measure
qc(i, j): the probability that a randomly chosen edge connects two nodes with remaining degrees i and j.
With no assortative mixing (no AM/DM): qc(i, j) = q(i) * q(j) (the two degrees are independent).
Assortativeness measure r: based on the difference between qc(i, j) and the independent case q(i) * q(j) (a normalized covariance of the remaining degrees at the two ends of an edge).
Normalized such that -1 <= r <= 1: r = -1 means highly DM, r = +1 means highly AM.
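As a hedged illustration (not the paper's own code), r can be estimated as the Pearson correlation of the remaining degrees at the two ends of every edge, counting each undirected edge in both directions; the function name assortativity is ours.

    from collections import Counter
    import math

    def assortativity(edges):
        deg = Counter()
        for a, b in edges:
            deg[a] += 1
            deg[b] += 1
        xs, ys = [], []
        for a, b in edges:
            # remaining degrees at both ends, counted in both directions
            xs += [deg[a] - 1, deg[b] - 1]
            ys += [deg[b] - 1, deg[a] - 1]
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
        sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
        return cov / (sx * sy)   # undefined (zero variance) for regular graphs

    # A star graph is maximally disassortative: the hub links only to leaves
    print(assortativity([(0, i) for i in range(1, 9)]))   # -1.0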

An Example Measure
Given an edge between nodes a and b, suppose the remaining degree q(a) = i. What can we say about q(b)?
High AM (r > 0): with high probability close to i.
High DM (r < 0): with high probability different from i (either higher or lower).
No AM/DM (r = 0): no conclusion can be drawn.

Entropy and Information
Entropy is defined in several domains. The relevant ones are:
Thermodynamic Entropy (Clausius): a measure of the amount of energy in a physical system which cannot be used to do work.
Statistical Entropy (Boltzmann): a measure of how ordered a system is.
Information Entropy (Shannon): a measure of how random a signal or random event is.

Information
The information of a message is a measure of the decrease of uncertainty at the receiver.
[Diagram: the sender knows M = 5; the receiver initially only knows M is one of 1..100; after receiving the message [M = 5], the receiver knows M = 5 exactly]
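A tiny numeric illustration of this slide, assuming the receiver initially considers all 100 values of M equally likely (an assumption, not stated on the slide):

    import math
    before = math.log2(100)   # uncertainty before the message: ~6.64 bits
    after = 0.0               # after "M = 5" the receiver is certain
    print(before - after)     # the message carried ~6.64 bits of information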

Information Entropy
The more uncertainty, the more information.
Let x be the result of a coin toss (x = H or x = T).
Unbiased coin (P(H) = 1/2, P(T) = 1/2): x carries 1 bit of information (knowing x removes 1 bit of uncertainty).
Biased coin (P(H) = 0.9, P(T) = 0.1): x does not carry that much information; the decrease of uncertainty at the receiver is low (compare with the possible values for M in the previous example).
The more uncertain (random) a message is to an outsider, the more information it carries.

Information Entropy
Information ~ uncertainty, and information entropy is a measure of the randomness of an event.
Entropy = information carried by an event.
High entropy corresponds to more informative, random events; low entropy to less informative, ordered events.
Consider Turkish: a Turkish text of 10 letters does not contain "10 letters' worth" of information (try your favorite compression algorithm on a Turkish text; for English it has been found that a letter carries about 1.5 bits of information).

Information Entropy
Formally: H(x) = - sum_{i=1..n} p(i) log p(i)
H(x): entropy of an event x (e.g. a message)
i = 1..n: all possible outcomes for x
p(i): probability that the i-th outcome occurs
The more random the event (the closer the probabilities are to equal), the higher the entropy.
Highest possible entropy = log(n), reached for the uniform distribution.
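A minimal sketch of the definition above in Python (our own illustration, not part of the slides), applied to the fair and biased coins from the previous slide:

    import math

    def entropy(probs):
        # H = -sum_i p(i) * log2(p(i)); terms with p = 0 contribute 0 by convention
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))   # 1.0 bit: fair coin, the maximum log2(2)
    print(entropy([0.9, 0.1]))   # ~0.47 bits: biased coin, less informative
    print(entropy([0.1] * 10))   # ~3.3219 bits = log2(10): uniform over 10 outcomes

The last value, log2(10) ≈ 3.3219, is the same as the largest entropy on the "Example Entropy Calculations" slide below.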

Information Entropy
[Figure: for a Bernoulli trial (X = {0, 1}), the graph of entropy vs. Pr(X = 1)]
The maximum is H(X) = 1 bit = log2(2), reached at Pr(X = 1) = 0.5.

Example Entropy Calculations
[Figure: four example distributions, from uniform to highly peaked, with decreasing entropies H = 3.3219, 3.1036, 2.7251 and 1.2764]

So What? Any questions so far?

So What?
Apply information theory and entropy as a measure of the "orderedness" of a graph.
Remember assortativeness? It is a correlation measure, so it only captures a linear relation between the two variables in qc(i, j).
Mutual information between two variables is a more general measure that also captures non-linear relations: when I know X, how much do I know about Y?

Measures
Network Entropy H(q): heterogeneity of the node degrees (entropy of the remaining degree distribution).
Noise H(q|q'): entropy of the probability of observing a node with remaining degree k, given that the node at the other end of the chosen edge has remaining degree k'.
Information Transfer I = H(q) - H(q|q'): the mutual information between the degrees of two neighboring nodes.
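The following Python sketch (an assumed implementation, not the authors' code) estimates all three quantities from an undirected edge list by counting every edge in both directions to build the joint remaining-degree distribution qc(k, k'); the function name information_measures is ours.

    from collections import Counter
    import math

    def information_measures(edges):
        deg = Counter()
        for a, b in edges:
            deg[a] += 1
            deg[b] += 1
        joint = Counter()                      # counts for qc(k, k') over ordered edge ends
        for a, b in edges:
            ka, kb = deg[a] - 1, deg[b] - 1    # remaining degrees
            joint[(ka, kb)] += 1
            joint[(kb, ka)] += 1
        total = sum(joint.values())
        qc = {kk: c / total for kk, c in joint.items()}
        q = Counter()                          # marginal remaining-degree distribution q(k)
        for (k, kp), p in qc.items():
            q[k] += p
        H_q = -sum(p * math.log2(p) for p in q.values())
        H_joint = -sum(p * math.log2(p) for p in qc.values())
        H_cond = H_joint - H_q                 # H(q|q') = H(q, q') - H(q')
        return H_q, H_cond, H_q - H_cond       # entropy, noise, information transfer

    # Example: in a star graph the degree at one end of an edge fully determines the
    # degree at the other end, so the noise is 0 and all of the entropy is transferred.
    H, Hc, I = information_measures([(0, i) for i in range(1, 9)])
    print(H, Hc, I)   # 1.0, 0.0, 1.0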

Results
[Figure: noise H(q|q') versus network entropy H(q) for real networks; the line consists of points where the information transfer is zero, i.e. H(q) = H(q|q')]

Results
Low information transfer means that knowing the degree of a node does not tell us much about the degrees of its neighbors: small assortativeness.
It looks like many (if not all) complex networks are heterogeneous (high entropy) and have low degree correlations.
Are degree correlations irrelevant? Or are they non-existent for some reason?

Results
Maybe there is a selective pressure that favors networks with a heterogeneous degree distribution and low assortativeness once a complexity limit is reached.
A Monte Carlo search by simulated annealing is performed; it provides evidence suggesting this is NOT the case.

Monte Carlo Search
The search is done in the multi-dimensional space of all networks with N nodes and E edges.
Two-dimensional parameter space for networks: H (entropy) and Hc (noise).
For every random sample, find the corresponding point (H, Hc).
Perform a Monte Carlo search to minimize a potential function for a candidate graph Ω [formula not shown].
From the error for Ω, ε(Ω), we can calculate a likelihood for Ω: a measure of how likely it is to reach Ω from a given random starting point.
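Below is a hypothetical sketch of such a search. The transcript does not reproduce the exact potential function, so the "energy" here is simply the squared distance in the (H, Hc) plane between a candidate graph and a target point; the edge-rewiring move, the cooling schedule and all parameter values are our assumptions, and the sketch reuses the information_measures function defined earlier.

    import math
    import random

    def anneal(n_nodes, n_edges, H_target, Hc_target, steps=20000, T0=1.0, cooling=0.999):
        # start from a random graph with exactly n_edges edges
        possible = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)]
        edges = set(random.sample(possible, n_edges))

        def energy(es):
            # assumed potential: distance to the target (H, Hc) point
            H, Hc, _ = information_measures(list(es))   # sketch defined earlier
            return (H - H_target) ** 2 + (Hc - Hc_target) ** 2

        E = energy(edges)
        T = T0
        for _ in range(steps):
            # propose a move: rewire one existing edge to a currently unused node pair
            old = random.choice(tuple(edges))
            new = random.choice(possible)
            if new in edges:
                continue
            candidate = (edges - {old}) | {new}
            E_new = energy(candidate)
            # Metropolis acceptance rule
            if E_new < E or random.random() < math.exp((E - E_new) / T):
                edges, E = candidate, E_new
            T *= cooling
        return edges, E   # the final error plays the role of epsilon(Omega)

The final error indicates how hard it was to reach the target (H, Hc) point from a random starting graph, which is the quantity turned into a likelihood on the next slides.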

Results
The candidate graphs that occupy the same region as the observed real graphs turned out to be the most likely ones.
Note that for a very large portion of the theoretically possible network space, it is almost impossible to obtain graphs located in that area.
The high-likelihood area is where the scale-free networks reside.

Discussion
The authors claim that the observed lack of degree correlation and the high heterogeneity are not the result of adaptation or parameter selection, but of higher-level limitations (!) on network architectures.
Without assuming a particular network growth model, they show that only a very specific domain of all possible networks is attainable by an optimization algorithm; outside this domain it is not feasible to find graphs that satisfy the complexity constraints.
These results and this formulation might be a step towards explaining why such different networks, operating and evolving under such different conditions, share so many common properties.

Thank you for your attention. Questions?

"In many different domains with different constraints, the systems usually end up with networks that fall in the specific area we found in the monte-carlo simulations. This is not only because of some evolutionary (or other) constraints which favor the networks in that area but also because most of the networks actually reside in that area. We mean, even if there was a constraint in the system that favors the networks whose entropy/noise values fall outside of the mentioned area, the system would be unsuccesful most of the time in its search/evolution/development for such a network (just as our monte-carlo search did)".