An Index of Data Size to Extract Decomposable Structures in LAD
Hirotaka Ono, Mutsunori Yagiura, Toshihide Ibaraki (Kyoto Univ.)
Overview of this presentation
1. What is LAD?
2. Decomposable structures in LAD
– Their significance
3. An index of decomposability
– Based on probabilistic analysis
4. Numerical experiments
5. Conclusion
Logical Analysis of Data
For a phenomenon,
Input:
– Positive examples (the phenomenon occurs)
– Negative examples (the phenomenon doesn’t occur)
Output: discriminant function (a logical explanation of the phenomenon)
Example: influenza
Attributes (1 = Yes, 0 = No): Fever, Headache, Cough, Snivel, Stomachache
– Positive examples: set of patients having influenza
– Negative examples: set of patients having common cold
– Examples of discriminant function
A discriminant function represents the knowledge “influenza”. This is one kind of knowledge acquisition.
Guideline to find a discriminant function
– Simplicity
– Explain the structure of the phenomenon
Decomposable function
[Figure: general function and decomposable structure]
– Simplicity
– Explain the structure of the phenomenon
Example: concept of “square”
Attributes, in column order:
1: the lengths of all edges are equal
2: the number of vertices is 4
3: contains a right angle
4: the area is over 100
Shape   1  2  3  4
i       1  1  1  0
ii      1  1  1  1
iii     0  1  1  0
iv      1  0  0  1
v       1  1  0  1
[Figure: the shapes i to v]
Example: concept of “square”
Direct description:
– Square: the lengths of all edges are equal, the number of vertices is 4, contains a right angle
Description through a sub-concept:
– Square: rhombus, contains a right angle
– Rhombus: the lengths of all edges are equal, the number of vertices is 4
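The two descriptions of “square” can be checked mechanically. The following is a minimal Python sketch, assuming that the four digits of each shape follow the order of the attribute list, and that shapes i and ii are the positive examples (squares) while iii, iv and v are negative. The function names are illustrative and do not come from the presentation.

```python
# Shapes i-v from the "square" example, encoded as
# (all edges equal, four vertices, has a right angle, area over 100).
EXAMPLES = {
    "i": (1, 1, 1, 0),
    "ii": (1, 1, 1, 1),
    "iii": (0, 1, 1, 0),
    "iv": (1, 0, 0, 1),
    "v": (1, 1, 0, 1),
}

# ASSUMPTION: i and ii are squares (positive); the others are negative.
POSITIVE = {"i", "ii"}


def square_flat(x):
    """Discriminant function written directly on the attributes."""
    return x[0] & x[1] & x[2]


def rhombus(x):
    """Sub-concept that depends on the first two attributes only."""
    return x[0] & x[1]


def square_decomposed(x):
    """The same concept expressed through the sub-concept 'rhombus'."""
    return rhombus(x) & x[2]


def is_consistent(f):
    """True if f is 1 on every positive example and 0 on every negative one."""
    return all(f(x) == int(name in POSITIVE) for name, x in EXAMPLES.items())


if __name__ == "__main__":
    print(is_consistent(square_flat), is_consistent(square_decomposed))
```

Both functions classify the five shapes identically; the decomposed form additionally exposes the intermediate concept, which is the point of looking for decomposable structures.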
Hierarchical structures and decomposable structures
[Figure: concept and attributes]
Hierarchical structures and decomposable structures
[Figure: concept, sub-concept and attributes]
Past research on decomposable structures
– Finding basic decomposable functions for given data and attribute sets: solvable in polynomial time in one case [Boros et al. 1994]
– Finding other classes (positive, Horn, and their mixtures) of decomposable functions for given data and attribute sets [Makino et al. 1995]
– Finding a (positive) decomposable function for given data when the attribute sets are not given: a heuristic algorithm is proposed [Ono et al. 1999]
The number of data and decomposable structures
Case 1: The size of the given data is small.
– Advantage: Less computational time is needed to find a decomposable structure.
– Disadvantage: Decomposable structures exist easily in the data (because there are fewer constraints), so most decomposable structures are deceptive.
The number of data and decomposable structures
Case 2: The size of the given data is large.
– Advantage: Deceptive decomposable structures will not be found.
– Disadvantage: More computational time is needed.
How many data vectors should be prepared to extract real decomposable structures?
Index of decomposability
Overview of our approach
Assume that the data is a set of randomly chosen vectors.
1. Compute the probability that an edge appears in the conflict graph.
2. Regard the conflict graph as a random graph, and investigate the probability that the conflict graph is non-bipartite.
The data is decomposable if and only if its conflict graph is bipartite.
Conflict graph
[Figure: conflict graph]
The data is decomposable if and only if its conflict graph is bipartite.
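The bipartiteness test behind this criterion can be sketched in Python. The formal definition of the conflict graph is not legible in this transcript, so the construction below is an assumption based on the usual formulation for decompositions of the form g(x[S0], h(x[S1])): the vertices are the projections of the data vectors onto S1, and two projections are joined when a positive and a negative example agree on S0. The names `conflict_graph`, `s0` and `s1` are hypothetical.

```python
from collections import deque
from itertools import product


def project(vector, indices):
    """Restrict a 0/1 vector to the given attribute indices."""
    return tuple(vector[i] for i in indices)


def conflict_graph(positives, negatives, s0, s1):
    """ASSUMED construction: join the S1-projections of a positive and a
    negative example whenever the two examples agree on S0."""
    adjacency = {project(v, s1): set() for v in list(positives) + list(negatives)}
    for a, b in product(positives, negatives):
        if project(a, s0) == project(b, s0):
            ya, yb = project(a, s1), project(b, s1)
            adjacency[ya].add(yb)
            adjacency[yb].add(ya)
    return adjacency


def is_bipartite(adjacency):
    """Two-colour the graph by breadth-first search; fail on an odd cycle."""
    color = {}
    for start in adjacency:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return False  # an odd cycle (or self-loop) was found
    return True
```

Under this assumed construction, a valid 2-colouring of the conflict graph plays the role of the sub-concept h: vectors that must be told apart receive different colours.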
Random graph
– The number of vertices is fixed.
– Each edge appears with a given probability, independently of the other edges.
In our analysis, this probability is assumed to be the probability that an edge appears in the conflict graph.
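The random graph model can also be explored by simulation. The following Python sketch is an illustration under assumed notation (n vertices, edge probability p; these symbols and all function names are not taken from the slides): it samples graphs with independent edges and estimates how often they are bipartite.

```python
import random
from collections import deque


def random_graph(n, p, rng):
    """Sample a graph on n vertices; each edge appears independently with probability p."""
    adjacency = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adjacency[u].append(v)
                adjacency[v].append(u)
    return adjacency


def is_bipartite(adjacency):
    """Two-colour the graph by breadth-first search; fail on an odd cycle."""
    color = [None] * len(adjacency)
    for start in range(len(adjacency)):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if color[w] is None:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    return False
    return True


def bipartite_fraction(n, p, trials=200, seed=0):
    """Monte Carlo estimate of the probability that the random graph is bipartite."""
    rng = random.Random(seed)
    hits = sum(is_bipartite(random_graph(n, p, rng)) for _ in range(trials))
    return hits / trials
```

Sweeping p over a range of values for a fixed n gives an empirical picture of the transition from mostly bipartite to mostly non-bipartite graphs.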
Probability that an edge appears in the conflict graph
An edge appears in the conflict graph if and only if there exists a linked pair.
A pair of vectors is called linked if
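Define a random variable as the number of linked pairs; it is at least 1 exactly when the edge appears in the conflict graph, that is, when there exists a linked pair.
We want to compute the probability of this event.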
How to compute the probability?
Assumptions on the generation of vectors:
– Vectors are randomly sampled without replacement.
– A sampled vector is a positive example with a fixed probability, and a negative example otherwise.
How to compute ? is easier to compute. 1. Both of 2. They have different values (i.e., 0 and 1)
Upper and lower bounds on the probability
– Upper bound: by Markov’s inequality and linearity of expectation
– Lower bound: by the principle of inclusion and exclusion
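As a worked form of these two bounds, here is a sketch in assumed notation (the symbols are not taken from the slides): let A_1, ..., A_m be the events that the individual candidate pairs are linked, X_i the indicator of A_i, and X the sum of the X_i. The upper bound is Markov's inequality combined with linearity of expectation; the lower bound is the second-order truncation of inclusion-exclusion.

```latex
\begin{align*}
\Pr(X \ge 1) &\le \mathrm{E}[X] = \sum_{i=1}^{m} \mathrm{E}[X_i] = \sum_{i=1}^{m} \Pr(A_i), \\
\Pr(X \ge 1) &= \Pr\Bigl(\bigcup_{i=1}^{m} A_i\Bigr)
  \ge \sum_{i=1}^{m} \Pr(A_i) - \sum_{1 \le i < j \le m} \Pr(A_i \cap A_j).
\end{align*}
```

The gap between the two bounds is at most the sum of the pairwise terms, so the bounds are close whenever two candidate pairs are rarely linked at the same time.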
Approximation of
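Probability that a random graph is non-bipartite
– Random variable: the number of odd cycles in the random graph
– Target: the probability that the random graph is bipartite
– Compute (an approximation of) the expected number of odd cycles and apply Markov’s inequality
– The count is based on the number of sequences of vertices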
Taylor series of When does hold? Upper bound:
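A standard version of this computation can be written out as follows. This is a sketch in assumed notation, not necessarily the exact derivation on the slides: N is the number of vertices, p the edge probability, and Y the number of odd cycles. A cycle of length k corresponds to 2k of the N(N-1)...(N-k+1) sequences of k distinct vertices, and a graph is bipartite exactly when it has no odd cycle.

```latex
\begin{align*}
\mathrm{E}[Y] &= \sum_{\substack{3 \le k \le N \\ k \text{ odd}}}
  \frac{N(N-1)\cdots(N-k+1)}{2k}\, p^{k}
  \le \sum_{\substack{k \ge 3 \\ k \text{ odd}}} \frac{(Np)^{k}}{2k}, \\
\sum_{\substack{k \ge 3 \\ k \text{ odd}}} \frac{x^{k}}{2k}
  &= \frac{1}{4}\ln\frac{1+x}{1-x} - \frac{x}{2} \qquad (0 \le x < 1), \\
\Pr(\text{not bipartite}) &= \Pr(Y \ge 1) \le \mathrm{E}[Y]
  \le \frac{1}{4}\ln\frac{1+Np}{1-Np} - \frac{Np}{2} \qquad (Np < 1).
\end{align*}
```

The second line is the Taylor series of the logarithm with its first-order term removed; the resulting bound is small when Np is well below 1 and grows without limit as Np approaches 1.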
When does hold? Lower bound when : if For sufficiently large, ( and are constant.)
Our index
– Probability that an edge appears in the conflict graph
– Threshold for a random graph to be bipartite or not
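Our index
– If the data size is smaller than the index, the data has many deceptive decomposable structures.
– If the data size is larger than the index, the data tends to have no deceptive decomposable structure.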
Numerical experiments
1. Prepare non-decomposable randomly generated functions and construct 10 pdBfs for each data size.
2. Check their decomposability.
Randomly generated data
– Target functions are not decomposable
– Dimensions of data are
– Two types of data: biased and not biased
Randomly generated data
[Plot: ratio of decomposable pdBfs (%) against sampling ratio (%), with our index marked]
Randomly generated data
[Plot: ratio of decomposable pdBfs (%) against sampling ratio (%)]
Real-world data
– Breast Cancer in Wisconsin (a.k.a. BCW)
– Already binarized
– The dimension is
– Comparison with randomly generated data with same size and
BCW and randomly generated data
[Plots for BCW and for randomly generated data: ratio of decomposable pdBfs (%) against sampling ratio (%)]
Discussion and conclusion
– In most cases, our index is a good estimate of the threshold point of decomposability; i.e., it is useful for knowing how many data vectors are required to extract decomposable structures.
– In case of and, threshold behavior of our index is not clear. We suppose that the following two violations of the assumptions cause it:
  – The edges in the conflict graph appear independently.
  – is small.
Future work
– Under some distributions of the data, determine how many data vectors are enough to extract decomposable structures.
– Apply this kind of approach to other classes of Boolean functions.