Published by Justin Harmon. Modified over 9 years ago.
1
K2 Algorithm Presentation
Learning Bayes Networks from Data
Haipeng Guo
Friday, April 21, 2000
KDD Lab, CIS Department, KSU
2
Presentation Outline
- Bayes Networks Introduction
- What’s K2?
- Basic Model and the Score Function
- K2 algorithm
- Demo
3
Bayes Networks Introduction
A Bayes network is a pair B = (Bs, Bp). The Bayes network structure Bs is a directed acyclic graph in which nodes represent random domain variables and arcs between nodes represent probabilistic dependence. Bs is augmented by conditional probabilities, Bp, to form a Bayes network B.
4
Bayes Networks Introduction
Example: Sprinkler
- Bs of the Bayes network: the structure, over the nodes x1 = Season, x2 = Sprinkler, x3 = Rain, x4 = Ground_moist, x5 = Ground_state.
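One way to hold a structure such as Bs in code is a map from each node to its list of parents, plus a check that the graph is acyclic. The sketch below is illustrative only: the arc set is an assumption (the slide's figure is not available here), and `is_acyclic` is a hypothetical helper name.

```python
from collections import deque

# Hypothetical arcs for the sprinkler example; the slide's figure is not
# reproduced here, so this arc set is an assumption for illustration only.
parents = {
    "Season": [],
    "Sprinkler": ["Season"],
    "Rain": ["Season"],
    "Ground_moist": ["Sprinkler", "Rain"],
    "Ground_state": ["Ground_moist"],
}


def is_acyclic(parents):
    """Return True if the parent map describes a directed acyclic graph."""
    # Kahn's algorithm: repeatedly remove nodes with no remaining parents.
    remaining = {node: set(ps) for node, ps in parents.items()}
    ready = deque(node for node, ps in remaining.items() if not ps)
    removed = 0
    while ready:
        node = ready.popleft()
        removed += 1
        for child, ps in remaining.items():
            if node in ps:
                ps.remove(node)
                if not ps:
                    ready.append(child)
    # Every node is removed exactly when there is no directed cycle.
    return removed == len(parents)
```

A parent map is convenient here because K2 itself works node by node on parent sets.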
5
Bayes Networks Introduction
- Bp of the Bayes network: the conditional probabilities for Season, Sprinkler, Rain, Ground_moist, and Ground_state.
6
What’s K2?
K2 is an algorithm for constructing a Bayes network from a database of records.
Reference: Gregory F. Cooper and Edward Herskovits, “A Bayesian Method for the Induction of Probabilistic Networks from Data”, Machine Learning 9, 1992.
7
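Basic Model
The problem: to find the most probable Bayes network structure given a database.
- D: a database of cases
- Z: the set of variables represented by D
- Bsi, Bsj: two Bayes network structures containing exactly those variables that are in Z
The posterior ratio of two structures reduces to a ratio of joint probabilities:
$$\frac{P(B_{Si} \mid D)}{P(B_{Sj} \mid D)} = \frac{P(B_{Si}, D) / P(D)}{P(B_{Sj}, D) / P(D)} = \frac{P(B_{Si}, D)}{P(B_{Sj}, D)}$$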
8
Basic Model
By computing such ratios for pairs of Bayes network structures, we can rank a set of structures by their posterior probabilities. Based on four assumptions, the paper introduces an efficient formula for computing P(Bs, D). Let Bs represent an arbitrary Bayes network structure containing just the variables in D.
9
Computing P(Bs,D)
- Assumption 1: The database variables, which we denote as Z, are discrete.
- Assumption 2: Cases occur independently, given a Bayes network model.
- Assumption 3: There are no cases that have variables with missing values.
- Assumption 4: The density function f(Bp|Bs) is uniform. Bp is a vector whose values denote the conditional-probability assignment associated with structure Bs.
10
Computing P(Bs,D)
$$P(B_S, D) = P(B_S) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!$$
where
- D: the dataset, with m cases (records)
- Z: a set of n discrete variables (x1, …, xn)
- ri: the number of possible value assignments (vi1, …, viri) of a variable xi in Z
- Bs: a Bayes network structure containing just the variables in Z
- πi: the set of parents of variable xi in Bs, represented as a list of variables
- qi: the number of unique instantiations of πi
- wij: the jth unique instantiation of πi relative to D
- Nijk: the number of cases in D in which variable xi has the value vik and πi is instantiated as wij
- Nij: the sum of Nijk over k, $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$
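The score can be evaluated directly from counts. The following is a minimal sketch, assuming complete discrete data stored as tuples of integers; the function name `log_score` and its argument layout are hypothetical. It returns the logarithm of P(Bs, D) / P(Bs), that is, the double product over i and j of (ri - 1)! / (Nij + ri - 1)! times the product over k of Nijk!.

```python
from collections import Counter
from math import lgamma


def log_score(data, parents, arity):
    """log of P(Bs, D) / P(Bs) for complete discrete data.

    data    : list of cases, each a tuple of integer values (one per variable)
    parents : parents[i] is a tuple of parent indices of variable i
    arity   : arity[i] is r_i, the number of values variable i can take
    """
    total = 0.0
    for i, pa in enumerate(parents):
        r = arity[i]
        # N_ijk: cases with parent instantiation j and x_i taking value k.
        n_ijk = Counter((tuple(case[p] for p in pa), case[i]) for case in data)
        # N_ij: sum of N_ijk over k.
        n_ij = Counter()
        for (j, _k), n in n_ijk.items():
            n_ij[j] += n
        # Parent instantiations that never occur in D contribute a factor of 1,
        # so only observed instantiations need to be visited.
        for n in n_ij.values():
            total += lgamma(r) - lgamma(n + r)  # log of (r-1)! / (N_ij + r - 1)!
        for n in n_ijk.values():
            total += lgamma(n + 1)              # log of N_ijk!
    return total
```

On a hypothetical database of four cases, (0, 0), (0, 0), (1, 1), (1, 1), over two binary variables, the structure with the arc x1 → x2 gives 1/270 and the structure with no arcs gives 1/900, so with equal structure priors the first is ranked higher.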
11
Decrease the computational complexity
Three more assumptions reduce the computational complexity to polynomial time:
<1> There is an ordering on the nodes such that if xi precedes xj, then we do not allow structures in which there is an arc from xj to xi.
<2> There exists a sufficiently tight limit on the number of parents of any node.
<3> P(πi → xi) and P(πj → xj) are independent when i ≠ j.
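To see why the ordering and the parent limit shrink the search, one can count the parent sets that remain admissible for a single node. This is a small illustrative sketch; the function name `candidate_parent_sets` and the symbol u for the parent limit are assumptions here.

```python
from math import comb


def candidate_parent_sets(i, u):
    """Number of parent sets allowed for the i-th node (1-based) in the
    ordering when it may take at most u parents from its i-1 predecessors."""
    return sum(comb(i - 1, k) for k in range(min(u, i - 1) + 1))
```

For the fifth node in an ordering, a limit of two parents leaves 1 + 4 + 6 = 11 candidate sets instead of all 16 subsets of its predecessors, and the count grows polynomially in i for fixed u.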
12
K2 algorithm: a heuristic search method
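Use the following functions:
$$g(i, \pi_i) = \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!$$
where the Nijk are computed relative to πi being the parents of xi and relative to a database D.
Pred(xi) = {x1, ..., xi-1}
returns the set of nodes that precede xi in the node ordering.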
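As a worked instance of g, consider a hypothetical database (an assumption for illustration, not the presentation's demo data) of four cases over two binary variables in which x2 always equals x1: two cases with (x1, x2) = (0, 0) and two with (1, 1). Here r2 = 2. With no parents there is one instantiation, with N2j = 4 split as 2 and 2 over the values of x2. With x1 as parent there are two instantiations, each with N2j = 2 and both cases on a single value of x2. Each factor has the form (ri - 1)! / (Nij + ri - 1)! times the product of the Nijk!:

```latex
\begin{align*}
g(2, \emptyset) &= \frac{(2-1)!}{(4+2-1)!}\; 2!\, 2! = \frac{4}{120} = \frac{1}{30}, \\
g(2, \{x_1\}) &= \left[ \frac{(2-1)!}{(2+2-1)!}\; 2!\, 0! \right]^{2} = \left[ \frac{2}{6} \right]^{2} = \frac{1}{9}.
\end{align*}
```

Since 1/9 > 1/30, adding x1 as a parent of x2 increases the score on this data.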
13
K2 algorithm: a heuristic search method
{Input: A set of n nodes, an ordering on the nodes, an upper bound u on the number of parents a node may have, and a database D containing m cases}
{Output: For each node, a printout of the parents of the node}
14
K2 algorithm: a heuristic search method
Procedure K2
for i := 1 to n do
    πi := ∅;
    Pold := g(i, πi);
    OKToProceed := true;
    while OKToProceed and |πi| < u do
        let z be the node in Pred(xi) - πi that maximizes g(i, πi ∪ {z});
        Pnew := g(i, πi ∪ {z});
        if Pnew > Pold then
            Pold := Pnew;
            πi := πi ∪ {z};
        else OKToProceed := false;
    end {while};
    write("Node:", xi, "Parents of this node:", πi);
end {for};
end {K2}
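The procedure translates fairly directly into Python. The sketch below is one possible rendering under stated assumptions, not the presenter's implementation: variables are 0-based and already listed in the node ordering, data is complete, scores are compared in log form, and the names `log_g` and `k2` are hypothetical. It also stops early when a node has no remaining predecessors to try, a case the pseudocode leaves implicit.

```python
from collections import Counter
from math import lgamma


def log_g(i, pa, data, arity):
    """log g(i, pi_i), computed with lgamma instead of explicit factorials."""
    r = arity[i]
    n_ijk = Counter((tuple(case[p] for p in pa), case[i]) for case in data)
    n_ij = Counter()
    for (j, _k), n in n_ijk.items():
        n_ij[j] += n
    score = sum(lgamma(r) - lgamma(n + r) for n in n_ij.values())
    return score + sum(lgamma(n + 1) for n in n_ijk.values())


def k2(data, arity, u):
    """Greedy K2 search; returns the chosen parent list for each node."""
    result = []
    for i in range(len(arity)):
        pa = []
        p_old = log_g(i, pa, data, arity)
        ok_to_proceed = True
        while ok_to_proceed and len(pa) < u:
            # Pred(x_i) - pi_i: predecessors not yet chosen as parents.
            candidates = [z for z in range(i) if z not in pa]
            if not candidates:
                break  # nothing left to add (implicit in the pseudocode)
            z = max(candidates, key=lambda c: log_g(i, pa + [c], data, arity))
            p_new = log_g(i, pa + [z], data, arity)
            if p_new > p_old:
                p_old, pa = p_new, pa + [z]
            else:
                ok_to_proceed = False
        result.append(pa)
    return result
```

Because the logarithm is monotonic, comparing log scores selects the same parents as comparing g itself.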
15
Conditional probabilities
Let θijk denote the conditional probability P(xi = vik | πi = wij), that is, the probability that xi has value vik, for some k from 1 to ri, given that the parents of xi, represented by πi, are instantiated as wij. We call θijk a network conditional probability. Let ξ denote the four assumptions. The expected value of θijk is:
$$E[\theta_{ijk} \mid D, B_S, \xi] = \frac{N_{ijk} + 1}{N_{ij} + r_i}$$
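Once a structure is fixed, the expected network conditional probabilities follow from the same counts as (Nijk + 1) / (Nij + ri). A minimal sketch, assuming complete integer-coded data and covering only parent instantiations that occur in D; the name `expected_theta` is hypothetical.

```python
from collections import Counter


def expected_theta(i, pa, data, arity):
    """Expected theta_ijk = (N_ijk + 1) / (N_ij + r_i) for each observed
    parent instantiation j and each value k of variable i."""
    r = arity[i]
    n_ijk = Counter((tuple(case[p] for p in pa), case[i]) for case in data)
    n_ij = Counter(tuple(case[p] for p in pa) for case in data)
    return {
        (j, k): (n_ijk[(j, k)] + 1) / (n_ij[j] + r)
        for j in n_ij
        for k in range(r)
    }
```

For each parent instantiation the estimates sum to one over k, and a value never seen in D still receives a nonzero probability of 1 / (Nij + ri).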
16
Demo Example
Input: the dataset is generated from the following structure over the nodes x1, x2, x3.
17
Demo Example
Note: use log[g(i, πi)] instead of g(i, πi) to save running time.
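Working with the logarithm turns the products of factorials into sums, and it also keeps the numbers representable for large counts. The sketch below illustrates this with Python's standard `math.lgamma`, using the identity n! = Γ(n + 1); the helper name `log_factorial` is hypothetical.

```python
from math import factorial, lgamma, log


def log_factorial(n):
    """log(n!) via the log-gamma function, since n! = Gamma(n + 1)."""
    return lgamma(n + 1)


# For small n the two routes agree.
assert abs(log_factorial(10) - log(factorial(10))) < 1e-9

# For large counts the plain factorial no longer fits in a float,
# while its logarithm is an ordinary number.
try:
    float(factorial(200))
    fits_in_float = True
except OverflowError:
    fits_in_float = False
```

With a few hundred cases in D, the factorial terms in g already exceed floating-point range, whereas their logarithms stay modest.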