A Cooperative Coevolutionary Genetic Algorithm for Learning Bayesian Network Structures Arthur Carvalho
Outline Bayesian Networks CCGA Experiments Conclusion
Bayesian Networks AI technique –Diagnosis, predictions, modelling knowledge Graphical model –Represents a joint distribution over a set of random variables –Exploits conditional independence –Concise, natural representation
Bayesian Networks [Figure: example network over variables X1, X2, X3 with the conditional probability table of each node]
Directed acyclic graph (DAG) –Nodes: random variables –Edges: direct influence of one variable on another Each node is associated with a conditional probability distribution (CPD)
Bayesian Networks Learning the structure of the network (DAG) –Structure learning problem Learning parameters that define the CPDs –Parameter estimation –Maximum Likelihood estimation
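To make the parameter-estimation step concrete, a minimal Python sketch (the dataset layout and the function name are illustrative assumptions, not from the talk): with fully observed data, the maximum-likelihood CPD entries are simply normalized counts.

from collections import Counter, defaultdict

def mle_cpd(data, node, parents):
    # Maximum-likelihood estimate of P(node | parents) from a fully
    # observed dataset; each row is a dict mapping variable -> value.
    joint = defaultdict(Counter)
    for row in data:
        pa_config = tuple(row[p] for p in parents)
        joint[pa_config][row[node]] += 1
    cpd = {}
    for pa_config, counts in joint.items():
        total = sum(counts.values())
        cpd[pa_config] = {value: c / total for value, c in counts.items()}
    return cpd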
Bayesian Networks Structure learning problem in fully observable datasets –Find a DAG that maximizes P(DAG | Data) –[Cooper & Herskovits, 92]
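A sketch of the Cooper-Herskovits marginal likelihood, the score this approach maximizes, computed in log space; the data format and function names are assumptions made for illustration.

import math
from collections import Counter, defaultdict

def log_ch_score(data, parents, cardinality):
    # log P(Data | DAG) under the Cooper-Herskovits metric with uniform
    # Dirichlet priors; 'parents' maps each node to its parent list and
    # 'cardinality' gives the number of states r_i of each node.
    score = 0.0
    for node, pa in parents.items():
        r_i = cardinality[node]
        counts = defaultdict(Counter)            # N_ijk counts
        for row in data:
            counts[tuple(row[p] for p in pa)][row[node]] += 1
        for counter in counts.values():
            n_ij = sum(counter.values())
            score += math.lgamma(r_i) - math.lgamma(n_ij + r_i)
            score += sum(math.lgamma(n_ijk + 1) for n_ijk in counter.values())
    return score

With a uniform prior over structures, maximizing this likelihood is equivalent to maximizing P(DAG | Data).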
Bayesian Networks NP-Hard [Chickering et al, 1994] –The number of possible structures is superexponential in the number of nodes [Robinson, 1977] –For a network with n nodes, the number of different structures f(n) satisfies Robinson's recurrence: f(n) = Σ_{i=1..n} (−1)^(i+1) · C(n, i) · 2^(i(n−i)) · f(n−i), with f(0) = 1
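The recurrence is easy to evaluate; the short sketch below reproduces the counts and shows how quickly they blow up.

from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def num_dags(n):
    # Robinson's (1977) recurrence for the number of labelled DAGs on n nodes
    if n == 0:
        return 1
    return sum((-1) ** (i + 1) * comb(n, i) * 2 ** (i * (n - i)) * num_dags(n - i)
               for i in range(1, n + 1))

print(num_dags(3))    # 25
print(num_dags(10))   # 4175098976430598143, already about 4.2e18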
Outline Bayesian Networks CCGA Experiments Conclusion
CCGA Structure learning task can be decomposed into two dependent subtasks –To find an optimal ordering of the nodes –To find an optimal connectivity matrix
CCGA [Figure: worked example on nodes D, A, B, C, building a node ordering and filling in the corresponding connectivity matrix row by row; a decoding sketch follows below]
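Decoding a candidate solution into a DAG is straightforward; a minimal sketch (the D, A, B, C ordering and the matrix entries below are assumed for illustration, not the exact values from the slides):

def decode_dag(ordering, conn):
    # Read the connectivity matrix strictly above the diagonal, so every
    # edge points from an earlier node in the ordering to a later one;
    # the decoded graph is therefore acyclic by construction.
    n = len(ordering)
    return [(ordering[i], ordering[j])
            for i in range(n)
            for j in range(i + 1, n)
            if conn[i][j]]

# Hypothetical values in the spirit of the slides' D, A, B, C example:
decode_dag(['D', 'A', 'B', 'C'],
           [[0, 1, 1, 0],
            [0, 0, 0, 1],
            [0, 0, 0, 1],
            [0, 0, 0, 0]])
# -> [('D', 'A'), ('D', 'B'), ('A', 'C'), ('B', 'C')]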
CCGA Two subpopulations –Binary (edges) –Permutation (nodes) Cooperative Coevolutionary Genetic Algorithm (CCGA) –Each subpopulation is evolved with a canonical GA
CCGA Evaluating individuals in each subpopulation –Each subpopulation member is combined with both the best known individual and a random individual from the other subpopulation –The fitness function is applied to the two resulting solutions; the higher value becomes the member's fitness CCGA-2 [Potter & De Jong, 1994]
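A sketch of this CCGA-2-style credit assignment; the function names and the fitness signature are illustrative assumptions.

import random

def evaluate(members, other_best, other_pop, fitness):
    # Each member collaborates with the best known individual and with a
    # random individual from the other subpopulation; its fitness is the
    # better of the two resulting complete solutions.
    scores = []
    for member in members:
        partner = random.choice(other_pop)
        scores.append(max(fitness(member, other_best),
                          fitness(member, partner)))
    return scores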
CCGA Genetic operators –Selection: tournament selection (both subpopulations) –Crossover: two-point crossover (binary), cycle crossover (permutation) –Mutation: bit-flip mutation (binary), swap mutation (permutation) –Replacement: preserve the best solution (both subpopulations)
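The permutation operators are the less standard ones; a minimal sketch of swap mutation and cycle crossover (the binary subpopulation uses ordinary bit-flip mutation and two-point crossover).

import random

def swap_mutation(perm, p_mut):
    # Swap two randomly chosen positions with probability p_mut.
    perm = list(perm)
    if random.random() < p_mut:
        i, j = random.sample(range(len(perm)), 2)
        perm[i], perm[j] = perm[j], perm[i]
    return perm

def cycle_crossover(p1, p2):
    # Cycle crossover (CX): positions are partitioned into cycles and the
    # child copies whole cycles alternately from each parent, so the
    # result is always a valid permutation.
    child = [None] * len(p1)
    pos = {v: i for i, v in enumerate(p1)}   # value -> index in p1
    from_p1 = True
    while None in child:
        start = j = child.index(None)        # open a new cycle
        while True:
            child[j] = p1[j] if from_p1 else p2[j]
            j = pos[p2[j]]                   # follow the cycle
            if j == start:
                break
        from_p1 = not from_p1                # alternate parents per cycle
    return child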
Outline Bayesian Networks CCGA Experiments Conclusion
Experiments Setup –Comparison with the K2 algorithm [Cooper & Herskovits, 1992] –Alarm network: 37 nodes and 46 edges –Insurance network: 27 nodes and 52 edges –Three datasets per network: 1000, 3000, and 5000 instances –100 executions
Experiments Parameters –Generations: 250 –Population size: 100 –Crossover probability: 0.6 –Mutation probability (binary population): 1 / max # of edges –Mutation probability (permutation population): 0.5
Experiments Alarm network [Table: average score, standard deviation, and p-value for CCGA and K2 on the Alarm datasets with 1000, 3000, and 5000 instances]
Experiments Insurance network [Table: average score, standard deviation, and p-value for CCGA and K2 on the Insurance datasets with 1000, 3000, and 5000 instances]
Outline Bayesian Networks CCGA Experiments Conclusion
Conclusion A new algorithm for the structure learning problem –Novel representation –Good performance Future work –Incomplete datasets –Graph-related problems
Thank you! Source code and datasets available at: Arthur Carvalho