1 White parts from: Technical overview for machine-learning researcher – slides from UAI 1999 tutorial, Part II
4 ... $= C_{t,h}$. Example: for (ht + htthh), we get $p(d \mid m) = 3!\,2!/6!$.
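The value 3!2!/6! can be checked with a short sketch. It assumes a uniform Beta(1, 1) prior on the heads probability and a data set with 3 heads and 2 tails; neither assumption is stated in the extracted slide text, but together they reproduce the number shown. The function name is illustrative only.

```python
from fractions import Fraction
from math import factorial


def coin_marginal_likelihood(heads: int, tails: int) -> Fraction:
    """Marginal likelihood p(d | m) of one specific flip sequence.

    Assumption: uniform Beta(1, 1) prior on theta = P(heads), so
    p(d | m) = integral of theta^h (1 - theta)^t over [0, 1]
             = h! t! / (h + t + 1)!
    """
    return Fraction(factorial(heads) * factorial(tails),
                    factorial(heads + tails + 1))


# 3 heads and 2 tails, as in the sequence "htthh".
value = coin_marginal_likelihood(3, 2)
print(value)  # 1/60, which equals 3! 2! / 6!
```

Exact fractions are used so the result can be compared with the factorial expression without rounding.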
6 Numerical example for the network over $X_1$ and $X_2$. Imaginary sample sizes are denoted $N'_{ijk}$. Data: (true, true) and (true, false).
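The numbers on this slide did not survive extraction, so the following is only a sketch of how such an example can be worked through. It assumes the structure X1 → X2 and sets every imaginary sample size N'_ijk to 1; both choices are assumptions made for illustration, not values taken from the slide. The helper `family_term` is a hypothetical name.

```python
from fractions import Fraction
from math import factorial


def gamma_int(n: int) -> int:
    """Gamma function at a positive integer: Gamma(n) = (n - 1)!."""
    return factorial(n - 1)


def family_term(prior_counts, data_counts) -> Fraction:
    """Dirichlet marginal likelihood for one parent configuration:
    Gamma(N') / Gamma(N' + N) * prod_k Gamma(N'_k + N_k) / Gamma(N'_k).
    """
    n_prior, n_data = sum(prior_counts), sum(data_counts)
    term = Fraction(gamma_int(n_prior), gamma_int(n_prior + n_data))
    for a, n in zip(prior_counts, data_counts):
        term *= Fraction(gamma_int(a + n), gamma_int(a))
    return term


data = [(True, True), (True, False)]  # (x1, x2) pairs from the slide

# Root node X1: counts of true / false.
x1_counts = [sum(1 for x1, _ in data if x1),
             sum(1 for x1, _ in data if not x1)]
score = family_term([1, 1], x1_counts)  # assumed N' = 1 per cell

# Child node X2, one term per value of its assumed parent X1.
for parent_value in (True, False):
    x2_counts = [
        sum(1 for x1, x2 in data if x1 == parent_value and x2),
        sum(1 for x1, x2 in data if x1 == parent_value and not x2),
    ]
    score *= family_term([1, 1], x2_counts)

print(score)  # 1/18 under these assumptions
```

The same result follows from sequential prediction: X1 contributes (1/2)(2/3) = 1/3, and X2 given X1 = true contributes (1/2)(1/3) = 1/6.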
8 [Figure with two panels: "Used so far" and "Desired"]
9 How do we assign structure and parameter priors? Structure priors: uniform; partial order (allowed/prohibited edges); proportional to similarity to some a priori network.
10 BDe, K2
18 So how do we generate parameter priors? Example: suppose the hyper-distribution for $(X_1, X_2)$ is $\mathrm{Dir}(a_{00}, a_{01}, a_{10}, a_{11})$.
19 Example: suppose the hyper-distribution for $(X_1, X_2)$ is $\mathrm{Dir}(a_{00}, a_{01}, a_{10}, a_{11})$. This determines a Dirichlet distribution for the parameters of both directed models.
21 Summary: suppose the parameters for $(X_1, X_2)$ are distributed $\mathrm{Dir}(a_{00}, a_{01}, a_{10}, a_{11})$. Then the parameters for $X_1$ are distributed $\mathrm{Dir}(a_{00} + a_{01},\ a_{10} + a_{11})$. Similarly, the parameters for $X_2$ are distributed $\mathrm{Dir}(a_{00} + a_{10},\ a_{01} + a_{11})$.
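The summary can be turned into a small sketch that derives the priors for both directed models from the joint hyperparameters. The marginal rows follow the summary directly. The conditional rows use the standard Dirichlet property that the parameters of one variable given a value of the other are again Dirichlet, with the matching row or column of the joint hyperparameters; that step is not spelled out in the extracted text. The function name and the example values are illustrative.

```python
def directed_priors(a):
    """Derive Dirichlet hyperparameters for X1 -> X2 and X2 -> X1.

    a[i][j] is the joint hyperparameter a_ij for (X1 = i, X2 = j).
    """
    x1_marginal = (a[0][0] + a[0][1], a[1][0] + a[1][1])
    x2_marginal = (a[0][0] + a[1][0], a[0][1] + a[1][1])
    # Conditional priors (standard Dirichlet property, see lead-in).
    x2_given_x1 = {i: (a[i][0], a[i][1]) for i in (0, 1)}
    x1_given_x2 = {j: (a[0][j], a[1][j]) for j in (0, 1)}
    return {
        "X1 -> X2": {"X1": x1_marginal, "X2 | X1": x2_given_x1},
        "X2 -> X1": {"X2": x2_marginal, "X1 | X2": x1_given_x2},
    }


# Hypothetical hyperparameters a_00 = 1, a_01 = 2, a_10 = 3, a_11 = 4.
priors = directed_priors([[1, 2], [3, 4]])
print(priors["X1 -> X2"]["X1"])  # (3, 7)
print(priors["X2 -> X1"]["X2"])  # (4, 6)
```

In both directions the hyperparameters sum to the same total, so the two directed models carry the same equivalent sample size.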
22 BDe score:
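The formula on this slide was lost in extraction. As a hedged illustration, the sketch below uses the standard Bayesian Dirichlet form, with imaginary counts derived from a joint Dirichlet over (X1, X2), and checks that the two directed models X1 → X2 and X2 → X1 receive the same score. The joint hyperparameters (all equal to 1) and the function names are assumptions for illustration; the data are the two cases (true, true) and (true, false), encoded as 1 and 0.

```python
from fractions import Fraction
from math import factorial


def gamma_int(n: int) -> int:
    """Gamma function at a positive integer: Gamma(n) = (n - 1)!."""
    return factorial(n - 1)


def family_term(prior, counts) -> Fraction:
    """Gamma(N') / Gamma(N' + N) * prod_k Gamma(N'_k + N_k) / Gamma(N'_k)."""
    term = Fraction(gamma_int(sum(prior)), gamma_int(sum(prior) + sum(counts)))
    for a, n in zip(prior, counts):
        term *= Fraction(gamma_int(a + n), gamma_int(a))
    return term


def bd_score(a, data, x1_is_parent=True) -> Fraction:
    """Score a two-node network with priors derived from joint counts a[i][j].

    a[i][j] is the hyperparameter for (X1 = i, X2 = j); data holds (x1, x2).
    """
    if not x1_is_parent:
        # Swap roles: transpose the hyperparameters and flip each case.
        a = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]
        data = [(x2, x1) for x1, x2 in data]
    n = [[0, 0], [0, 0]]
    for parent, child in data:
        n[parent][child] += 1
    # Root node: marginal hyperparameters and marginal counts.
    score = family_term([sum(a[0]), sum(a[1])], [sum(n[0]), sum(n[1])])
    # Child node: one term per parent value.
    for p in (0, 1):
        score *= family_term(a[p], n[p])
    return score


joint = [[1, 1], [1, 1]]  # assumed joint hyperparameters
data = [(1, 1), (1, 0)]   # (true, true) and (true, false)
print(bd_score(joint, data, x1_is_parent=True))   # 1/20
print(bd_score(joint, data, x1_is_parent=False))  # 1/20
```

Both scores also equal the marginal likelihood of the data under the joint Dir(1, 1, 1, 1) prior, Γ(4)/Γ(6) = 1/20.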
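26 Functional Equations Example. Example: $f(x+y) = f(x)\,f(y)$. Assumptions: positive everywhere, differentiable. Solution: taking logarithms and differentiating with respect to $x$ gives $(\ln f)'(x+y) = (\ln f)'(x)$ for every $y$, and so $(\ln f)'(x) = \text{constant}$. Hence $(\ln f)(x)$ is a linear function, hence $f(x) = c\,e^{ax}$; substituting back into the equation gives $c = 1$.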
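A quick numerical check of this solution, as a sketch only: it tests the equation f(x + y) = f(x) f(y) on a handful of arbitrarily chosen points. The sample points, the value of a, and the helper name are all illustrative.

```python
import math


def satisfies_equation(f, points, rel_tol=1e-9) -> bool:
    """Check f(x + y) == f(x) * f(y) on every pair of sample points."""
    return all(
        math.isclose(f(x + y), f(x) * f(y), rel_tol=rel_tol)
        for x in points
        for y in points
    )


points = [-1.5, 0.0, 0.3, 2.0]  # arbitrary sample points
a = 0.7                         # arbitrary rate constant

exp_ok = satisfies_equation(lambda x: math.exp(a * x), points)
scaled_ok = satisfies_equation(lambda x: 2.0 * math.exp(a * x), points)
print(exp_ok)     # True: e^{ax} satisfies the equation
print(scaled_ok)  # False: a prefactor c = 2 breaks it
```

The second check shows why the constant in front of the exponential is pinned down once the solution is substituted back into the original equation.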
27 The bivariate discrete case