
Many-Pairs Mutual Information for Adding Structure to Belief Propagation Approximations Arthur Choi and Adnan Darwiche University of California, Los Angeles.




1 Many-Pairs Mutual Information for Adding Structure to Belief Propagation Approximations Arthur Choi and Adnan Darwiche University of California, Los Angeles {aychoi,darwiche}@cs.ucla.edu

2 Many-Pairs Mutual Information [diagram: a pair of variables X and Y, with the mutual information between them]

3 d-Separation If X and Y are d-separated by Z, then X and Y are independent given Z. [network: Earthquake? (E) → Alarm? (A) ← Burglary? (B); Earthquake? (E) → Radio? (R); Alarm? (A) → Call? (C)] Are R and B d-separated by E?

4 d-Separation Each path is a pipe; each variable on it is a valve; a valve is either open or closed. Are R and B d-separated by A? [same network: E, B, A, C, R]

5 d-Separation Sequential valve: X → W → Y (e.g., the valve at A on E → A → C).

6 d-Separation Divergent valve: X ← W → Y (e.g., the valve at E on R ← E → A).

7 d-Separation Convergent valve: X → W ← Y (e.g., the valve at A on E → A ← B).

8 d-Separation Are R and B d-separated by E? E is closed, A is closed, so R and B are d-separated.

9 d-Separation Are R and B d-separated by A? E is open, A is open, so R and B are not d-separated.

10 d-Separation What if E or A is "nearly" closed? Are R and B "nearly" independent?
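The valve rules of slides 4–9 can be sketched in code. A minimal illustration on the Earthquake network (E → A, B → A, E → R, A → C); the function and variable names are my own, not from the talk:

```python
# Valve-based d-separation test on a single path of the Earthquake network.
parents = {"E": [], "B": [], "A": ["E", "B"], "R": ["E"], "C": ["A"]}

def descendants(node):
    """All descendants of a node (children, grandchildren, ...)."""
    children = [v for v, ps in parents.items() if node in ps]
    result = set(children)
    for c in children:
        result |= descendants(c)
    return result

def valve_open(prev, valve, nxt, Z):
    """Is the valve at `valve` open on the path segment prev - valve - nxt,
    given evidence variables Z?"""
    if valve in parents[prev] and valve in parents[nxt]:
        # divergent valve (prev <- valve -> nxt): open unless valve is given
        return valve not in Z
    if prev in parents[valve] and nxt in parents[valve]:
        # convergent valve (prev -> valve <- nxt): open iff the valve
        # or one of its descendants is given
        return valve in Z or bool(descendants(valve) & Z)
    # sequential valve (prev -> valve -> nxt or the reverse): open unless given
    return valve not in Z

def path_blocked(path, Z):
    """A path is blocked iff at least one valve on it is closed."""
    return any(not valve_open(path[i - 1], path[i], path[i + 1], Z)
               for i in range(1, len(path) - 1))

# Slide 8: are R and B d-separated by E?  (path R - E - A - B)
print(path_blocked(["R", "E", "A", "B"], {"E"}))   # True: d-separated
# Slide 9: are R and B d-separated by A?
print(path_blocked(["R", "E", "A", "B"], {"A"}))   # False: not d-separated
```

Since R and B are connected by a single path in this polytree, blocking that path settles d-separation outright.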

11 Mutual Information and Entropy Mutual Information: MI(X;Y | z) = Σ_{x,y} Pr(x,y | z) log [ Pr(x,y | z) / (Pr(x | z) Pr(y | z)) ]; non-negative; zero iff X and Y are independent given z.

12 d-Separation versus MI
d-Separation: hard outcomes; graphical test; no inference needed; efficient.
Mutual Information: soft outcomes; non-graphical; requires inference (joint marginals on pairs of variables); many-pairs MI is difficult.

13 d-Separation versus MI
d-Separation: hard outcomes; graphical test; no inference needed; efficient.
Mutual Information: soft outcomes; non-graphical; requires inference (joint marginals on pairs of variables); many-pairs MI is difficult.
soft d-Separation (in polytrees): combines the advantages of d-Separation and MI; a graphical test with soft outcomes.

14 Mutual Information and Entropy
Mutual Information: MI(X;Y | z) = Σ_{x,y} Pr(x,y | z) log [ Pr(x,y | z) / (Pr(x | z) Pr(y | z)) ]; non-negative; zero iff X and Y are independent given z.
Entropy: ENT(X | z) = −Σ_x Pr(x | z) log Pr(x | z); non-negative; zero iff X is fixed; maximized by the uniform distribution.
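Both quantities can be computed directly from discrete probability tables. A small illustrative sketch (my own code, not from the talk):

```python
# Mutual information and entropy for discrete distributions given as tables.
import math

def mutual_information(joint):
    """MI(X;Y) for a joint table joint[x][y]; non-negative,
    zero iff X and Y are independent."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log(pxy / (px[x] * py[y]))
    return mi

def entropy(p):
    """ENT(X) = -sum_x Pr(x) log Pr(x); zero iff X is fixed,
    maximized by the uniform distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# independent variables: MI is zero
print(round(mutual_information([[0.25, 0.25], [0.25, 0.25]]), 6))  # 0.0
# fully correlated variables: MI equals the entropy of either variable
print(round(mutual_information([[0.5, 0.0], [0.0, 0.5]]), 6))  # 0.693147 (= ln 2)
print(round(entropy([0.5, 0.5]), 6))                           # 0.693147
```

The conditional versions on the slide are the same computations applied to the tables Pr(·,· | z) and Pr(· | z).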

15 Soft d-Separation in Polytrees Sequential valve X → W → Y. Theorem 1: MI(X;Y | z) ≤ ENT(W | z).

16 Soft d-Separation in Polytrees Divergent valve X ← W → Y. Theorem 1: MI(X;Y | z) ≤ ENT(W | z).
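Theorem 1 can be sanity-checked numerically: on a sequential valve X → W → Y, MI(X;Y) never exceeds ENT(W). A sketch with arbitrary random CPTs (my own construction, not from the talk):

```python
# Numeric check of Theorem 1 on random chains X -> W -> Y:
# MI(X;Y) <= ENT(W).
import itertools
import math
import random

random.seed(0)

def rand_dist(n):
    ps = [random.random() for _ in range(n)]
    s = sum(ps)
    return [p / s for p in ps]

for _ in range(100):
    px = rand_dist(2)                        # Pr(X)
    pw_x = [rand_dist(3) for _ in range(2)]  # Pr(W | X)
    py_w = [rand_dist(2) for _ in range(3)]  # Pr(Y | W)

    # joint Pr(X, W, Y) for the chain X -> W -> Y
    joint = {(x, w, y): px[x] * pw_x[x][w] * py_w[w][y]
             for x, w, y in itertools.product(range(2), range(3), range(2))}

    pxy = [[sum(joint[x, w, y] for w in range(3)) for y in range(2)]
           for x in range(2)]
    pw = [sum(joint[x, w, y] for x in range(2) for y in range(2))
          for w in range(3)]

    pxm = [sum(row) for row in pxy]
    pym = [sum(col) for col in zip(*pxy)]
    mi = sum(p * math.log(p / (pxm[x] * pym[y]))
             for x, row in enumerate(pxy) for y, p in enumerate(row) if p > 0)
    ent_w = -sum(p * math.log(p) for p in pw if p > 0)
    assert mi <= ent_w + 1e-12, (mi, ent_w)

print("Theorem 1 held on 100 random chains")
```

The bound follows from the data-processing inequality: MI(X;Y) ≤ MI(X;W) ≤ ENT(W) whenever W separates X from Y.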

17 Soft d-Separation in Polytrees Convergent valve at W with parents N1 and N2 on the path X — N1 → W ← N2 — Y. Theorem 2: MI(X;Y | z) ≤ MI(N1;N2 | z).

18 Soft d-Separation in Polytrees soft d-separation on a path X — W1 — W2 — W3 — W4 — W5 — W6 — Y:
sd-sep(X,z,Y) = 0, if X and Y are disconnected
= MI(X;Y | z), if X and Y are adjacent
= the smallest valve bound on the path, otherwise

19 Soft d-Separation in Polytrees soft d-separation on a path X — W1 — W2 — W3 — W4 — W5 — W6 — Y:
sd-sep(X,z,Y) = 0, if X and Y are disconnected
= MI(X;Y | z), if X and Y are adjacent
= the smallest valve bound on the path, otherwise
MI(X;Y | z) ≤ sd-sep(X,z,Y)
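The recurrence above can be sketched as follows, assuming the per-valve bounds (ENT(W | z) from Theorem 1 or MI(N1;N2 | z) from Theorem 2) have already been computed; everything here, including the numbers, is illustrative:

```python
# sd-sep on a single path, given precomputed per-valve bounds.
def sd_sep(path_valve_bounds, adjacent=False, mi_if_adjacent=None,
           disconnected=False):
    """sd-sep(X, z, Y) per the slide: 0 if X and Y are disconnected,
    MI(X;Y|z) if they are adjacent, otherwise the smallest valve bound."""
    if disconnected:
        return 0.0
    if adjacent:
        return mi_if_adjacent
    return min(path_valve_bounds)

# Path X - W1 - ... - W6 - Y with hypothetical per-valve bounds:
bounds = [0.42, 0.10, 0.93, 0.05, 0.61, 0.30]
print(sd_sep(bounds))  # 0.05 -- the tightest valve bound upper-bounds MI(X;Y|z)
```

Since every valve bound is an upper bound on MI(X;Y | z), the minimum over the path is the tightest one available, which gives MI(X;Y | z) ≤ sd-sep(X,z,Y).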

20 d-Separation vs. MI vs. soft d-sep
d-Separation: hard outcomes; graphical test; no inference needed; efficient.
MI: soft outcomes; non-graphical; requires inference (joint marginals on pairs of variables); many-pairs MI is difficult.
soft d-sep: soft outcomes; graphical test; requires inference (family and node marginals only); efficient in polytrees.

21 Many-Pairs Mutual Information
Mutual information can be expensive, even in polytrees.
Bayesian network: n variables, at most w parents per node, at most s states per variable.
One run of BP: O(n s^w) time.
Single pair, MI: O(s) runs of BP, O(s · n s^w) time, using Pr(X,Y | z) = Pr(X | Y,z) Pr(Y | z) (one run per state of Y).
Single pair, sd-sep: one run of BP, O(n + n s^w) time.
k pairs, MI: O(ks) runs of BP, O(ks · n s^w) time.
k pairs, sd-sep: one run of BP, O(kn + n s^w) time.
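The O(s)-runs-per-pair figure comes from the chain-rule identity on the slide: one inference run per state y yields Pr(X | Y=y, z), which is then scaled by Pr(Y=y | z). A toy check of that assembly, with exact enumeration on a two-variable chain standing in for the BP runs:

```python
# Check of the identity Pr(X,Y|z) = Pr(X|Y=y,z) Pr(Y=y|z), assembled one
# state y at a time. Exact enumeration stands in for the s runs of BP.
import math

px = [0.3, 0.7]                      # Pr(X)
py_x = [[0.9, 0.1], [0.2, 0.8]]      # Pr(Y | X)

# direct joint Pr(X, Y)
joint = [[px[x] * py_x[x][y] for y in range(2)] for x in range(2)]

# assemble the same joint from s = 2 "runs": one per state y of Y
py = [sum(joint[x][y] for x in range(2)) for y in range(2)]
assembled = [[0.0, 0.0] for _ in range(2)]
for y in range(2):                   # each y plays the role of one BP run
    for x in range(2):
        px_given_y = joint[x][y] / py[y]          # Pr(X | Y=y)
        assembled[x][y] = px_given_y * py[y]      # Pr(X | Y=y) Pr(Y=y)

assert all(math.isclose(joint[x][y], assembled[x][y])
           for x in range(2) for y in range(2))
print("chain-rule assembly matches the direct joint")
```

In the real algorithm each "run" is a pass of BP with Y clamped to one state, which is exactly why a pair costs O(s) runs while sd-sep needs only the marginals from a single run.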

22 Application: ED-BP [CD06] ED-BP networks span a spectrum from loopy BP marginals (enough edges deleted to yield a polytree) to exact inference (all deleted edges recovered); edges to recover are chosen by mutual information.

23 Empirical Analysis: soft d-separation versus true MI
Start with a polytree ED-BP approximation (equivalently, run loopy BP).
Score deleted edges by sd-sep and by true MI (efficiency is important here).
Recover the highest-ranking edges (approximation accuracy is important here).

24 Empirical Analysis [plot: true MI of deleted edges vs. edge rank (true MI) in the alarm network, on linear and log scales]

25 Empirical Analysis [plot: true-MI and sd-sep scores vs. edge rank (true MI) in the alarm network, on linear and log scales]

26 Empirical Analysis [plot: average KL-error vs. number of edges recovered in the alarm network, random ranking]

27 Empirical Analysis [plot: average KL-error vs. number of edges recovered in the alarm network, random and true-MI rankings]

28 Empirical Analysis [plot: average KL-error vs. number of edges recovered in the alarm network, random, true-MI, and sd-sep rankings]

29 Empirical Analysis [plot: true-MI and sd-sep scores vs. edge rank (true MI) in the pigs network, on linear and log scales]

30 Empirical Analysis [plot: average KL-error vs. number of edges recovered in the pigs network, random, true-MI, and sd-sep rankings]

31 Empirical Analysis (inference time after recovering 0%, 10%, 20% of deleted edges; rank time is the time to score the deleted edges; speedups are relative to MI ranking)

network   method   0%      10%     20%     rank time         # deleted  # params
barley    random   115ms   120ms   141ms   0ms               371        30180
          MI       –       111ms   93ms    2999ms
          sd-sep   –       110ms   125ms   46ms   (65.84x)
diabetes  random   732ms   1103ms  1651ms  0ms               1904       61069
          MI       –       550ms   674ms   84604ms
          sd-sep   –       957ms   1639ms  132ms  (641.99x)
mildew    random   238ms   241ms   243ms   0ms               125        47158
          MI       –       233ms   263ms   6661ms
          sd-sep   –       245ms   323ms   42ms   (157.26x)
munin1    random   13ms    14ms    22ms    0ms               94         19466
          MI       –       12ms    10ms    680ms
          sd-sep   –       10ms    –       35ms   (19.57x)

32 Alternative Proposals & Extensions
Extensions to general networks: convergent valves are problematic; look at node-disjoint paths instead.
Extensions to undirected models: use entropy bounds on nodes; find a separating set with minimum aggregate bound; the optimal solution is found via network flows; this is easier to generalize, but the bounds are not as tight.
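The network-flow idea for undirected models can be sketched as a min cut with node splitting: each node is split into an in/out pair joined by an edge whose capacity is that node's entropy bound, so the min cut picks a separating set of minimum aggregate bound. A toy illustration (the graph, bounds, and names are made up):

```python
# Minimum-aggregate-bound separating set via max-flow / min-cut with
# node splitting.
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp max flow on a capacity map cap[u][v]."""
    total = 0.0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in cap[u]:
                if cap[u][v] > 1e-12 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total
        # collect the augmenting path, push the bottleneck, add residuals
        edges, v = [], t
        while parent[v] is not None:
            edges.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] for u, v in edges)
        for u, v in edges:
            cap[u][v] -= aug
            cap[v][u] += aug
        total += aug

INF = 1e9
cap = defaultdict(lambda: defaultdict(float))

# two node-disjoint paths X - A - Y and X - B - Y; each internal node is
# split into node_in -> node_out with capacity = its entropy bound
for node, ent in [("A", 0.2), ("B", 0.5)]:
    cap[node + "_in"][node + "_out"] += ent
for u, v in [("X", "A_in"), ("A_out", "Y"), ("X", "B_in"), ("B_out", "Y")]:
    cap[u][v] += INF

# min-cut value = smallest aggregate entropy bound of a separating set
flow_value = max_flow(cap, "X", "Y")
print(round(flow_value, 6))  # 0.7 -- the cut must pass through both A and B
```

Here every X–Y path must go through A or B, so the cut takes both node edges and the aggregate bound is ENT(A) + ENT(B) = 0.7.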

33 Thanks!




