1 Part II White Parts from: Technical overview for machine-learning researcher – slides from UAI 1999 tutorial

2 Reminder on Dirichlet distribution
Definition: The Dirichlet density function with parameters α1, …, αm, where each αi is a real number > 0, is p(θ1, …, θm) = [Γ(α1 + ⋯ + αm) / (Γ(α1) ⋯ Γ(αm))] · θ1^(α1−1) ⋯ θm^(αm−1), where 0 ≤ θi ≤ 1 and θ1 + ⋯ + θm = 1. The normalizing constant is chosen so that ∫ p(θ) dθ = 1. Notably, Γ(n) = (n−1)! for positive integers n. Example: For a Markov network over two binary variables X and Y, take an imaginary sample size of, say, N′ = 12, with the four joint configurations seen fractions of, say, (1/4, 1/6, 1/4, 1/3) of the time; this gives α = (3, 2, 3, 4).
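The following is a minimal Python sketch of this definition (the function and variable names are mine, not from the tutorial): it evaluates the Dirichlet density and reproduces the imaginary-sample-size example, α = N′ · (1/4, 1/6, 1/4, 1/3) = (3, 2, 3, 4).

```python
from math import gamma, prod

def dirichlet_pdf(theta, alpha):
    """Dirichlet density p(theta_1, ..., theta_m) with parameters alpha_1, ..., alpha_m."""
    assert abs(sum(theta) - 1.0) < 1e-9, "theta must lie on the simplex"
    norm = gamma(sum(alpha)) / prod(gamma(a) for a in alpha)   # Γ(Σ αi) / Π Γ(αi)
    return norm * prod(t ** (a - 1) for t, a in zip(theta, alpha))

# Imaginary sample size N' = 12; each of the four joint configurations of two
# binary variables is seen a fraction (1/4, 1/6, 1/4, 1/3) of the time.
N_prime = 12
fractions = (1/4, 1/6, 1/4, 1/3)
alpha = tuple(N_prime * f for f in fractions)
print(alpha)                            # (3.0, 2.0, 3.0, 4.0)
print(dirichlet_pdf(fractions, alpha))  # density at the prior means αi / N'
```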

3 Dirichlet priors for three equivalent models over binary X and Y
Joint model over (X, Y): Dir(3,2,3,4). Directed model X → Y: X ~ Dir(5,7), Y|x ~ Dir(3,2), Y|x̄ ~ Dir(3,4). Directed model Y → X: Y ~ Dir(6,6), X|y ~ Dir(3,3), X|ȳ ~ Dir(2,4). Data = {(x, y), (x, ȳ)}

4 Equivalent models should give rise to the same score.
Same setup as the previous slide: joint model with Dir(3,2,3,4); X → Y with Dir(5,7), Dir(3,2), Dir(3,4); Y → X with Dir(6,6), Dir(3,3), Dir(2,4); scored on Data = {(x, y), (x, ȳ)}.

5

6

7

8 So how to generate parameter priors?
Example: Suppose the hyper distribution for (X1, X2) is Dir(α00, α01, α10, α11).

9 Example: Suppose the hyper distribution for (X1,X2) is Dir(α00, α01 , α10, α11)
This determines a Dirichlet distribution for the parameters of both directed models.

10

11 Summary: Suppose the parameters for (X1,X2) are distributed Dir(α00, α01 , α10, α11).
Then, parameters for X1 are distributed Dir(α00+ α01 , α10+ α11). Similarly, parameters for X2 are distributed Dir(α00+ α10 , α01+ α11).
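A minimal Python sketch of this aggregation rule (names mine), which also derives the conditional priors for each directed model under the standard BDe-style construction in which the conditional prior for the child given a parent value is the corresponding slice of the joint hyperparameters; with (α00, α01, α10, α11) = (3, 2, 3, 4) it reproduces the Dir(5,7), Dir(3,2), Dir(3,4) and Dir(6,6), Dir(3,3), Dir(2,4) priors from the earlier X, Y example.

```python
# Joint Dirichlet hyperparameters α(x1, x2), stored as alpha[x1][x2].
# (α00, α01, α10, α11) = (3, 2, 3, 4), i.e. imaginary sample size N' = 12.
alpha = [[3, 2],
         [3, 4]]

def priors_x1_to_x2(alpha):
    """Priors for the directed model X1 -> X2."""
    prior_x1 = [sum(alpha[0]), sum(alpha[1])]            # Dir(α00+α01, α10+α11)
    prior_x2_given_x1 = [alpha[0][:], alpha[1][:]]       # Dir(α00,α01) and Dir(α10,α11)
    return prior_x1, prior_x2_given_x1

def priors_x2_to_x1(alpha):
    """Priors for the directed model X2 -> X1."""
    prior_x2 = [alpha[0][0] + alpha[1][0],               # Dir(α00+α10, α01+α11)
                alpha[0][1] + alpha[1][1]]
    prior_x1_given_x2 = [[alpha[0][0], alpha[1][0]],     # Dir(α00,α10) for X2 = 0
                         [alpha[0][1], alpha[1][1]]]     # Dir(α01,α11) for X2 = 1
    return prior_x2, prior_x1_given_x2

print(priors_x1_to_x2(alpha))   # ([5, 7], [[3, 2], [3, 4]])
print(priors_x2_to_x1(alpha))   # ([6, 6], [[3, 3], [2, 4]])
```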

12 BDe score:
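The slides here do not reproduce the formula, so for reference this is the standard BDe (likelihood-equivalent Bayesian Dirichlet) metric as given by Heckerman, Geiger and Chickering (1995), stated in my notation rather than copied from the tutorial: P(D | G) = ∏ᵢ ∏ⱼ [Γ(αij) / Γ(αij + Nij)] · ∏ₖ [Γ(αijk + Nijk) / Γ(αijk)], where j ranges over the parent configurations of Xi and k over its values, Nijk counts cases with Xi = k and parents in configuration j, Nij = Σₖ Nijk, αij = Σₖ αijk, and in the BDe metric αijk = N′ · p(Xi = k, Pai = j) for an equivalent sample size N′ and prior joint distribution p.

Below is a small Python check of the claim that equivalent models receive the same score, under my assumption that the data on the earlier slide is D = {(x, y), (x, ȳ)} (the overbar did not survive the transcript); both directed models over X and Y score exactly 1/26.

```python
from math import gamma

def family_score(rows):
    """BDe contribution of one node.

    rows: one (alpha_row, count_row) pair per parent configuration j, where
    alpha_row = (α_ij1, ..., α_ijr) and count_row = (N_ij1, ..., N_ijr).
    """
    score = 1.0
    for alpha_row, count_row in rows:
        a, n = sum(alpha_row), sum(count_row)
        score *= gamma(a) / gamma(a + n)                 # Γ(α_ij) / Γ(α_ij + N_ij)
        for ak, nk in zip(alpha_row, count_row):
            score *= gamma(ak + nk) / gamma(ak)          # Γ(α_ijk + N_ijk) / Γ(α_ijk)
    return score

# Data D = {(x, y), (x, ȳ)}: counts N(x,y) = 1, N(x,ȳ) = 1, all others 0.
# Model X -> Y with priors X ~ Dir(5,7), Y|x ~ Dir(3,2), Y|x̄ ~ Dir(3,4):
score_x_to_y = (family_score([((5, 7), (2, 0))]) *       # X observed as x twice
                family_score([((3, 2), (1, 1)),          # Y | X = x
                              ((3, 4), (0, 0))]))        # Y | X = x̄ (no cases)
# Model Y -> X with priors Y ~ Dir(6,6), X|y ~ Dir(3,3), X|ȳ ~ Dir(2,4):
score_y_to_x = (family_score([((6, 6), (1, 1))]) *       # Y observed as y once, ȳ once
                family_score([((3, 3), (1, 0)),          # X | Y = y
                              ((2, 4), (1, 0))]))        # X | Y = ȳ
print(score_x_to_y, score_y_to_x)                        # both 0.03846... = 1/26
```

The two scores agree for any dataset, which is the likelihood-equivalence property the BDe priors are constructed to satisfy.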

13

14

15 Functional Equations Example
Example: f(x+y) = f(x)·f(y). Taking logs gives ln f(x+y) = ln f(x) + ln f(y); differentiating with respect to x gives (ln f)′(x+y) = (ln f)′(x) for all y, so (ln f)′(x) is constant. Hence ln f(x) is a linear function and f(x) = c·e^(ax); substituting back into the original equation forces c = 1. Assumptions: f is positive everywhere and differentiable.
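A trivial numeric sanity check of this conclusion (Python, example values mine): f(x) = e^(ax) satisfies f(x+y) = f(x)·f(y), while an extra constant factor c ≠ 1 does not, which is why the constant in the general ODE solution must be 1.

```python
from math import exp, isclose

a = 0.7
f = lambda x: exp(a * x)            # satisfies f(x+y) = f(x) f(y)
g = lambda x: 2.5 * exp(a * x)      # c = 2.5 != 1 breaks the equation

for x, y in [(0.3, 1.1), (-2.0, 0.5), (4.2, -1.7)]:
    assert isclose(f(x + y), f(x) * f(y))
    assert not isclose(g(x + y), g(x) * g(y))   # c e^{a(x+y)} vs c^2 e^{a(x+y)}
print("f(x) = e^{ax} solves the functional equation; c must be 1")
```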

16 The bivariate discrete case

17 The bivariate discrete case

18 The bivariate discrete case

19 The bivariate discrete case

20

