Relative complexity measures

See also: R. Badii, A. Politi. Complexity. Cambridge University Press, 1997.
relative measures
- Information is rarely absolute; it is usually relative to some other information:
  - the state of another, coupled, system
    - e.g. the beacon signalling the fall of Troy and the return of Agamemnon (~1200 BC): a one-bit signal carrying a lot of "information" in the coupled system; that one bit in a different context would not carry the same message
  - the state of the same system at another place or time
  - the flow of information through space/time: computation
    - e.g. the earlier CA examples used entropy variance
- joint entropy; conditional entropy; mutual information
Useful concepts: joint probability (1)
- p(x,y) = probability that a pair of elements drawn at random from X, Y will have values x, y
  - finite sets X, Y of sizes N_X, N_Y, with elements x_i, y_j
- p(x,y) = p(x) p(y) if X and Y are independent
  - e.g. toss a coin and throw a die
  - p_cd(H,5) = probability of tossing a head and throwing a 5
  - p_cd(H,5) = p(H) p(5) = 1/2 × 1/6 = 1/12
Useful concepts: joint probability (2)
[figure: table of the 36 outcomes of two dice]
- cf. probability of the first die being 1 and the total of both being 6, if the events were independent:
  p_1st(1) p_sum(6) = 1/6 × 5/36 = 5/216
- actual probability of the first die being 1 and the total of both being 6 (not independent):
  p_12(1,6) = 1/6 × 1/6 = 1/36
- dependent events: the joint probability is not the product of the marginals
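A quick numerical check, as a minimal Python sketch: enumerate all 36 equally likely outcomes of two dice and compare the actual joint probability with the product of the marginals. (Nothing here is from the original slides beyond the numbers being verified.)

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of throwing two dice.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

p_first_is_1 = sum(p for d1, d2 in outcomes if d1 == 1)                    # 1/6
p_sum_is_6   = sum(p for d1, d2 in outcomes if d1 + d2 == 6)               # 5/36
p_joint      = sum(p for d1, d2 in outcomes if d1 == 1 and d1 + d2 == 6)   # 1/36

print(p_first_is_1 * p_sum_is_6)   # 5/216 -- what independence would give
print(p_joint)                     # 1/36  -- the actual joint probability
```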
Useful concepts: joint probability (3)
- p(x,y) = p(y) if X is determined completely by Y
  - e.g. probability of throwing an even number (E) and throwing a 6
  - 6 is even, so if you throw a 6, you throw an even number
  - p_d(E,6) = p(6) = 1/6
joint entropy: independent systems
- joint entropy of systems X and Y uses the relevant joint probability:
  H(X,Y) = − Σ_{x,y} p(x,y) log2 p(x,y)
- H(X,Y) = H(X) + H(Y) if X and Y are independent
  - H(coin, die) = H(coin) + H(die) = log2 2 + log2 6
- H(X,Y) = H(Y) if X is determined completely by Y
  - H(parity, die) = H(die) = log2 6
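The two cases above can be checked with a small Python sketch that computes the entropies directly from the uniform probabilities stated on the slide.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Independent systems: a fair coin (2 outcomes) and a fair die (6 outcomes).
h_coin = entropy([1/2] * 2)        # 1 bit
h_die = entropy([1/6] * 6)         # log2 6 ≈ 2.585 bits
h_coin_die = entropy([1/12] * 12)  # all 12 (coin, die) pairs equally likely
print(h_coin_die, h_coin + h_die)  # equal: joint entropy is additive

# X determined by Y: the parity of the die and the die value itself.
# The only possible joint outcomes are (parity(d), d) for d = 1..6, each 1/6.
h_parity_die = entropy([1/6] * 6)
print(h_parity_die, h_die)         # equal: H(parity, die) = H(die)
```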
Entropy of independent systems is additive
- H(X,Y) = H(X) + H(Y) if X and Y are independent
- consider a string S of N_S characters, each of N_C bits
- if the characters of the string are independent, the entropy of the string of bits = the entropy of the string of characters of bits:
  - as a string of characters of bits:
    - possible different characters = 2^N_C, so H_C = log2 2^N_C = N_C
    - H = H_C,1 + … + H_C,N_S = N_S H_C = N_S N_C
  - as a string of bits:
    - total number of bits = N_S N_C, so 2^(N_S N_C) possible strings
    - H = log2 2^(N_S N_C) = N_S N_C
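The same additivity can be confirmed by brute-force enumeration; in this sketch the values of N_S and N_C are arbitrary small choices so the enumeration stays tiny.

```python
import math
from itertools import product

# A string of N_S characters, each character made of N_C independent fair bits.
N_S, N_C = 3, 2

chars = list(product([0, 1], repeat=N_C))      # 2**N_C possible characters
h_char = math.log2(len(chars))                 # = N_C bits per character

strings = list(product(chars, repeat=N_S))     # 2**(N_S * N_C) possible strings
h_string = math.log2(len(strings))             # entropy of the whole (uniform) string

print(h_char, h_string, N_S * N_C)             # h_string = N_S * h_char = N_S * N_C
```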
joint entropy: summary
[figure: entropy Venn diagrams of H(X), H(Y), and H(X,Y)]
- H(X,Y) = H(X) + H(Y) if X and Y are independent: entropies are additive
- H(X,Y) = H(Y) if X is determined by Y
example: spatial CA states (1)
random: each site randomly on or off
[figure: CA state with regions labelled system X and system Y]
- system X: 4 possible states, equal probabilities, p(x_i) = 1/4, so H(X) = 2
- system Y: 4 possible states, equal probabilities, p(y_j) = 1/4, so H(Y) = 2
  - 2 bits of information in system X, 2 in Y
- all 16 possible states for (X, Y) equally probable: p(x_i, y_j) = 1/16, so H(X,Y) = 4
  - 4 bits of information in the joint system (X, Y)
- H(X,Y) = H(X) + H(Y): independent systems
example: spatial CA states (2)
semi-random: upper sites oscillate, lower sites random
[figure: CA state with regions labelled system X and system Y]
- system X: 2 possible states, equal probabilities, p(x_i) = 1/2, so H(X) = 1
- system Y: 4 possible states, equal probabilities, p(y_j) = 1/4, so H(Y) = 2
  - 1 bit of information in system X, 2 in Y
- all 8 possible states for (X, Y) equally probable: p(x_i, y_j) = 1/8, so H(X,Y) = 3
  - 3 bits of information in the joint system (X, Y)
- H(X,Y) = H(X) + H(Y): independent systems
example: spatial CA states (3)
semi-random: upper sites oscillate, lower sites random, but select a different X and Y:
[figure: CA state with different regions labelled system X and system Y]
- system X: 4 possible states, equal probabilities, so H(X) = 2
- system Y: 4 possible states, equal probabilities, so H(Y) = 2
  - 2 bits of information in system X, 2 in Y
- only 8 possible states for (X, Y), all equally probable: p(x_i, y_j) = 1/8, so H(X,Y) = 3
  - 3 bits of information in the joint system (X, Y)
- H(X,Y) ≠ H(X) + H(Y): not independent systems
example: spatial CA states (4)
oscillating sites
[figure: CA state with regions labelled system X and system Y]
- system X: 2 possible states, equal probabilities, so H(X) = 1
- system Y: 2 possible states, equal probabilities, so H(Y) = 1
  - 1 bit of information in system X, 1 in Y
- only 2 possible states for (X, Y), equally probable: p(x_i, y_j) = 1/2, so H(X,Y) = 1
  - 1 bit of information in the joint system (X, Y)
- H(X,Y) ≠ H(X) + H(Y): not independent systems
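The non-independent case (3) can be reproduced numerically. The sketch below is a toy model of my own construction, not the CA itself: each subsystem is modelled as one shared oscillating bit plus one private random bit, which gives the entropies H(X) = H(Y) = 2 and H(X,Y) = 3 quoted above.

```python
import math
import random
from collections import Counter

def entropy(samples):
    """Shannon entropy in bits, estimated from a list of observed states."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

random.seed(0)
xs, ys, xys = [], [], []
for _ in range(50_000):
    phase = random.randint(0, 1)     # the shared oscillating (upper) sites
    rx = random.randint(0, 1)        # X's own random (lower) site
    ry = random.randint(0, 1)        # Y's own random (lower) site
    x, y = (phase, rx), (phase, ry)
    xs.append(x); ys.append(y); xys.append((x, y))

print(entropy(xs))    # ~2 bits
print(entropy(ys))    # ~2 bits
print(entropy(xys))   # ~3 bits, less than H(X) + H(Y): not independent
```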
Another example: temporal CA states (1)
random: each site randomly on or off
- system X = a tile at time t: 16 possible states, equal probabilities, p(x_i) = 1/16, so H(X) = 4
- system Y = the same tile at time t+1: 16 possible states, equal probabilities, p(y_j) = 1/16, so H(Y) = 4
  - 4 bits of information in system X, 4 in Y
- all 16² = 256 possible states for (X, Y) equally probable: p(x_i, y_j) = 1/256, so H(X,Y) = 8
  - 8 bits of information in the joint system (X, Y)
- H(X,Y) = H(X) + H(Y): so independent systems
Another example: temporal CA states (2)
semi-random: e.g. a rule that causes upper sites to oscillate, lower sites random
- system X = a tile at time t: 8 possible states, equal probabilities, p(x_i) = 1/8, so H(X) = 3
- system Y = the tile at time t+1: 8 possible states, equal probabilities, p(y_j) = 1/8, so H(Y) = 3
  - 3 bits of information in system X, 3 in Y
- 8 × 4 = 32 possible states for (X, Y), all equally probable: p(x_i, y_j) = 1/2^5, so H(X,Y) = 5
  - 5 bits of information in the joint system (X, Y)
- H(X,Y) ≠ H(X) + H(Y): not independent systems
Another example: temporal CA states (3)
oscillating (rule)
- system X = a tile at time t: 2 possible states, equal probabilities, so H(X) = 1
- system Y = the tile at time t+1: 2 possible states, equal probabilities, so H(Y) = 1
  - 1 bit of information in system X, 1 in Y
- only 2 possible states for (X, Y), equally probable: p(x_i, y_j) = 1/2, so H(X,Y) = 1
  - 1 bit of information in the joint system (X, Y)
- H(X,Y) ≠ H(X) + H(Y): not independent systems
Useful concepts: conditional probability
- p(x|y) = probability that an element drawn from X has value x, given that y occurs
- p(x|y) = p(x) if X and Y are independent
  - p_cd(H|6) = probability of tossing a head given a 6 is thrown on the die
  - p_cd(H|6) = p(H) = 1/2
- useful identity: p(x|y) = p(x,y) / p(y)
  - p_12(1|6) = p_12(1,6) / p_sum(6) = (1/36) / (5/36) = 1/5
- cf. probability of the first die being 1 (independent): p_1st(1) = 1/6
  probability of the first die being 1 given the total of both is 6 (not independent): p_12(1|6) = 1/5
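The identity and the 1/5 value can again be checked by enumerating the two-dice outcomes (a minimal sketch using only the Python standard library).

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # two fair dice
p = Fraction(1, len(outcomes))

p_sum_6       = sum(p for d1, d2 in outcomes if d1 + d2 == 6)               # 5/36
p_first1_sum6 = sum(p for d1, d2 in outcomes if d1 == 1 and d1 + d2 == 6)   # 1/36

# The identity p(x|y) = p(x,y) / p(y):
p_first1_given_sum6 = p_first1_sum6 / p_sum_6
print(p_first1_given_sum6)   # 1/5, not 1/6: the events are dependent
```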
conditional entropy
- conditional entropy is the entropy due to X, given we know Y
- H(X|Y) = H(X) if X and Y are independent
- H(X|Y) = 0 if X is determined completely by Y
- equivalently: H(X,Y) = H(Y) + H(X|Y)
  - the joint entropy of X and Y is the entropy of Y plus whatever entropy is left in X once we know Y
- substituting for H(X,Y) and H(Y), after a little algebra:
  H(X|Y) = − Σ_{x,y} p(x,y) log2 p(x|y)
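A small numerical check of H(X|Y) = H(X,Y) − H(Y), using the parity/die example from earlier; the sketch assumes NumPy and writes the joint distribution out as the six equally likely (parity, value) pairs.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of an array of probabilities."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Joint distribution p(x,y) for X = parity of a fair die, Y = die value.
# Rows: parity (even, odd); columns: die value 1..6.
pxy = np.array([[0, 1/6, 0, 1/6, 0, 1/6],
                [1/6, 0, 1/6, 0, 1/6, 0]])

h_xy = entropy(pxy.ravel())      # H(X,Y) = log2 6
h_y = entropy(pxy.sum(axis=0))   # H(Y)   = log2 6
print(h_xy - h_y)                # H(X|Y) = 0: parity is determined by the die
```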
mutual information
- the mutual information in two systems is
  I(X;Y) = H(X) + H(Y) − H(X,Y)
- I(X;Y) = 0 if X and Y are independent
- in terms of conditional entropy:
  I(X;Y) = H(X) − H(X|Y)
- I(X;Y) = H(X) if X is determined completely by Y
- in terms of probabilities:
  I(X;Y) = Σ_{x,y} p(x,y) log2 [ p(x,y) / (p(x) p(y)) ]
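The three formulations are equivalent; this sketch (NumPy assumed) computes I(X;Y) all three ways, for the parity/die pair (where I = H(X) = 1 bit) and for the independent coin/die pair (where I = 0).

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(pxy):
    """I(X;Y) computed three equivalent ways from a joint distribution p(x,y)."""
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    h_x, h_y, h_xy = entropy(px), entropy(py), entropy(pxy.ravel())
    i_entropies = h_x + h_y - h_xy         # H(X) + H(Y) - H(X,Y)
    i_conditional = h_x - (h_xy - h_y)     # H(X) - H(X|Y)
    nz = pxy > 0
    i_probs = float((pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz])).sum())
    return i_entropies, i_conditional, i_probs

# X determined by Y (die parity vs die value): I(X;Y) = H(X) = 1 bit.
parity_die = np.array([[0, 1/6, 0, 1/6, 0, 1/6],
                       [1/6, 0, 1/6, 0, 1/6, 0]])
print(mutual_information(parity_die))

# Independent coin and die: I(X;Y) = 0.
print(mutual_information(np.full((2, 6), 1/12)))
```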
conditional entropy, mutual information: summary
[figure: entropy Venn diagrams showing H(X), H(Y), H(X|Y), H(Y|X), and I(X;Y)]
- H(X|Y) = H(X) and I(X;Y) = 0 if X and Y are independent
- H(X|Y) = 0 and I(X;Y) = H(X) if X is determined by Y
example: spatial CA states (1)
random: each site randomly on or off
[figure: CA state with regions labelled system X and system Y]
- H(X) = 2, H(Y) = 2, H(X,Y) = 4
- H(X,Y) = H(X) + H(Y): independent systems
- I(X;Y) = H(X) + H(Y) − H(X,Y) = 0: no mutual information
example: spatial CA states (2)
semi-random: upper sites oscillating, lower sites random
[figure: CA state with regions labelled system X and system Y]
- H(X) = 1, H(Y) = 2, H(X,Y) = 3
- H(X,Y) = H(X) + H(Y): independent systems
- I(X;Y) = H(X) + H(Y) − H(X,Y) = 0: no mutual information
example: spatial CA states (3)
semi-random: upper sites oscillating, lower sites random
[figure: CA state with different regions labelled system X and system Y]
- H(X) = 2, H(Y) = 2, H(X,Y) = 3
- H(X,Y) ≠ H(X) + H(Y): not independent systems
- I(X;Y) = H(X) + H(Y) − H(X,Y) = 1: one bit of mutual information
example: spatial CA states (4)
oscillating
[figure: CA state with regions labelled system X and system Y]
- H(X) = 1, H(Y) = 1, H(X,Y) = 1
- H(X,Y) ≠ H(X) + H(Y): not independent systems
- I(X;Y) = H(X) + H(Y) − H(X,Y) = 1: one bit of mutual information
Another example: temporal CA states
- random: each site randomly on or off
  - H(X) = 4, H(Y) = 4, H(X,Y) = 8
  - I(X;Y) = H(X) + H(Y) − H(X,Y) = 0: no mutual information
- semi-random: upper sites oscillating, lower sites random
  - H(X) = 3, H(Y) = 3, H(X,Y) = 5
  - I(X;Y) = H(X) + H(Y) − H(X,Y) = 1: one bit of mutual information
- oscillating
  - H(X) = 1, H(Y) = 1, H(X,Y) = 1
  - I(X;Y) = H(X) + H(Y) − H(X,Y) = 1: one bit of mutual information
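These temporal MI values can be estimated from sampled (state at t, state at t+1) pairs. The sketch below is a toy reconstruction of the three regimes rather than a simulation of the CA rules: the "random" tile is redrawn independently each step, the "semi-random" tile is one oscillating bit plus two random bits, and the "oscillating" case is a single flipping bit.

```python
import numpy as np

def mi_from_pairs(pairs):
    """Estimate I(X;Y) in bits from a list of jointly sampled (x, y) pairs."""
    xs = sorted({x for x, _ in pairs})
    ys = sorted({y for _, y in pairs})
    pxy = np.zeros((len(xs), len(ys)))
    for x, y in pairs:
        pxy[xs.index(x), ys.index(y)] += 1
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1, keepdims=True), pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px * py)[nz])).sum())

rng = np.random.default_rng(1)
n = 20_000

# random: the 4-bit tile is redrawn independently at t and t+1 -> I ~ 0
random_pairs = [(tuple(rng.integers(0, 2, 4)), tuple(rng.integers(0, 2, 4)))
                for _ in range(n)]

# semi-random: one oscillating bit (determined across the step) plus two random bits -> I ~ 1
semi_pairs = [((ph, *rng.integers(0, 2, 2)), (1 - ph, *rng.integers(0, 2, 2)))
              for ph in rng.integers(0, 2, n)]

# oscillating: a single bit that flips every step -> I = 1
osc_pairs = [(b, 1 - b) for b in rng.integers(0, 2, n)]

print(mi_from_pairs(random_pairs))   # ~0 (small positive sampling bias)
print(mi_from_pairs(semi_pairs))     # ~1 bit
print(mi_from_pairs(osc_pairs))      # 1 bit
```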
temporal CA states and Langton’s λ
- randomly generated 8-state, 2D CAs
- I(A;B) is the mutual information between a cell, A, and itself at the next time step, B
- relationship between Langton’s λ and mutual information, I:
  - I is low at extreme λ (corresponding to orderly class 1 and 2 CAs, and to chaotic class 3 CAs)
  - I is highest at intermediate λ (class 4 CAs)
- interesting behaviour depends on the transmission of information

H.A. Gutowitz and C.G. Langton. Methods for Designing Cellular Automata with “Interesting” Behavior.
Mutual information in RBNs
- N = 50 (nodes); K = 3 (inputs per node)
- p = average proportion of 1s in the randomly generated Boolean functions at the nodes
- mutual information reveals a transition
[figure: plots of the entropies H(t+1) and H(t+1|t), and of the mutual information I, against the bias p]

B. Luque, A. Ferrera. Measuring Mutual Information in Random Boolean Networks. adap-org/
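A sketch of this kind of experiment, under my own assumptions rather than Luque & Ferrera's exact protocol: build a random K = 3 network with bias p, run it, and estimate the average mutual information between each node's value at t and at t+1. The parameter values, the per-node averaging, and the run lengths are all illustrative choices.

```python
import numpy as np

def rbn_step_mi(N=50, K=3, p=0.5, steps=3000, burn_in=500, seed=0):
    """Simulate a random Boolean network (N nodes, K inputs each, bias p) and
    estimate the average per-node mutual information between t and t+1."""
    rng = np.random.default_rng(seed)
    inputs = rng.integers(0, N, size=(N, K))                  # each node reads K random nodes
    tables = (rng.random((N, 2 ** K)) < p).astype(np.uint8)   # random Boolean functions, 1 with probability p
    state = rng.integers(0, 2, size=N, dtype=np.uint8)

    counts = np.zeros((N, 2, 2))                              # joint counts of (x_t, x_{t+1}) per node
    powers = 2 ** np.arange(K)
    for t in range(steps):
        idx = (state[inputs] * powers).sum(axis=1)            # index into each node's lookup table
        new_state = tables[np.arange(N), idx]
        if t >= burn_in:
            counts[np.arange(N), state, new_state] += 1
        state = new_state

    pxy = counts / counts.sum(axis=(1, 2), keepdims=True)
    px = pxy.sum(axis=2, keepdims=True)
    py = pxy.sum(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(pxy > 0, pxy * np.log2(pxy / (px * py)), 0.0)
    return float(terms.sum(axis=(1, 2)).mean())               # average MI over nodes, in bits

if __name__ == "__main__":
    for p in (0.05, 0.15, 0.25, 0.5):     # spans the ordered/chaotic transition for K = 3
        print(p, rbn_step_mi(p=p))
```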
low mutual information
MI can be low because:
1. the correlation is low
2. or, the entropy is low
Mutual information in the Ising model
- e.g. the Ising model: mutual information (MI) between time steps
1. low temperature: low MI, because low entropy
2. mid temperature: high MI (around the phase transition)
3. high temperature: low MI, because low correlation
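A rough illustration rather than a careful Ising study: a Metropolis simulation in which MI is estimated between each spin's value before and after one full sweep. The lattice size, temperatures, run lengths, and the choice of "one sweep" as the time step are assumptions of the sketch; the qualitative low/high/low pattern across temperature is the point.

```python
import numpy as np

def ising_step_mi(L=16, T=2.3, sweeps=2000, burn_in=500, seed=0):
    """Metropolis Ising model on an L x L torus; estimate the mutual information
    between each spin's value before and after one full sweep, pooled over sites."""
    rng = np.random.default_rng(seed)
    s = rng.choice([-1, 1], size=(L, L))
    counts = np.zeros((2, 2))
    for sweep in range(sweeps):
        old = s.copy()
        for _ in range(L * L):                       # one Metropolis sweep
            i, j = rng.integers(0, L, size=2)
            nb = s[(i + 1) % L, j] + s[(i - 1) % L, j] + s[i, (j + 1) % L] + s[i, (j - 1) % L]
            dE = 2 * s[i, j] * nb
            if dE <= 0 or rng.random() < np.exp(-dE / T):
                s[i, j] = -s[i, j]
        if sweep >= burn_in:
            before = (old.ravel() + 1) // 2          # map spins -1/+1 to 0/1
            after = (s.ravel() + 1) // 2
            np.add.at(counts, (before, after), 1)
    pxy = counts / counts.sum()
    px, py = pxy.sum(axis=1, keepdims=True), pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px * py)[nz])).sum())

if __name__ == "__main__":
    for T in (1.0, 2.3, 5.0):    # below, near, and above the critical temperature
        print(T, ising_step_mi(T=T))
```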
Mutual information and evolution
- consider the information in a genome (G) in the context of the information in the environment (E)
  - the same genome would be less fit in a different environment
- evolution increases the mutual information between E and G
  - fitter organisms exploit the environment better, so must contain more information about their environment
- the total information in a genome can change, as the genome changes size, etc.
[figure: diagrams of H(G1|E) and I(G3;E) for genomes G1, G2, G3 of increasing fitness in environment E]

C. Adami. What is complexity? BioEssays, 24:1085–1094, 2002.
example: emergence
- consider the information in the high-level description (S) in the context of the information in the low-level description (E)
  - the same high-level model in a different low-level environment wouldn’t be as good
[figure: diagrams of H(S1|E) and I(S3;E) for models S1, S2, S3 of increasing fit to environment E]
- modelling/engineering as increasing mutual information
  - small H(S|E): good model
  - large H(E|S): redundancy
- could use MI as a fitness function to search for better models

A. Weeks, S. Stepney, F. A. C. Polack. Neutral Emergence: a proposal. Symposium on Complex Systems Engineering, RAND Corporation, Santa Monica, CA, USA, January 2007.