An Index of Data Size to Extract Decomposable Structures in LAD
Hirotaka Ono, Mutsunori Yagiura, Toshihide Ibaraki (Kyoto University)


Overview
1. Overview of LAD
2. Decomposability
   - Importance and motivation
3. An index of decomposability
   - Number of data vectors needed to extract reliable decomposable structures
   - Based on probabilistic analyses
4. Numerical experiments
5. Conclusion

Logical Analysis of Data (LAD)
For a phenomenon:
- Input: T, the positive examples (the phenomenon occurs), and F, the negative examples (the phenomenon does not occur)
- Output: a discriminant function f(x), a logical explanation of the phenomenon, with f(a) = 1 for every a in T and f(b) = 0 for every b in F
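The input/output contract of LAD can be illustrated with a minimal sketch. The data set and the two candidate functions below are hypothetical toy values, not taken from the slides; `is_discriminant` is an illustrative helper name.

```python
def is_discriminant(f, T, F):
    """Return True if f is 1 on every positive example and 0 on every negative one."""
    return all(f(a) == 1 for a in T) and all(f(b) == 0 for b in F)

# Hypothetical toy data over {0,1}^3 (an assumption, not from the slides).
T = [(1, 1, 0), (1, 0, 1)]   # positive examples
F = [(0, 1, 1), (0, 0, 0)]   # negative examples

def f_good(x):
    return x[0]              # f(x) = x1 separates T from F

def f_bad(x):
    return x[1]              # f(x) = x2 misclassifies (1,0,1) and (0,1,1)

print(is_discriminant(f_good, T, F))  # True
print(is_discriminant(f_bad, T, F))   # False
```

Many discriminant functions usually exist for the same (T, F), which is why a guideline for choosing among them is needed.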

Example: influenza
Attributes: Fever, Headache, Cough, Snivel, Stomachache (1 = Yes, 0 = No)
T: set of patients having influenza
F: set of patients having a common cold
An example of a discriminant function: [formula not preserved]
The discriminant function f(x) represents the knowledge "influenza". Finding it is one form of knowledge acquisition.

Guidelines for finding a discriminant function
- Simplicity
- Explains the structure of the phenomenon
[Figure: positive and negative examples in the $\{0,1\}^n$ space]
We focus on decomposability.

Decomposability
[Table of T and F over $x_1, \dots, x_5$ with a column for $h(x[S_1])$; entries not preserved]
Example: with $S_0 = \{1, 4, 5\}$, $S_1 = \{2, 3\}$ and $h(x[S_1]) = x_2 \lor x_3$,
$f(x) = x_1 x_2 x_4 \lor x_1 x_3 x_4 = x_1 x_4 \, h(x[S_1])$, so f is decomposable.
- f is decomposable $\iff$ $f(x) = g(x[S_0], h(x[S_1]))$
- (T, F) is decomposable $\iff$ there exists a decomposable discriminant function f
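The decomposition on this slide can be checked exhaustively. The sketch below assumes the connectives lost in the transcript are disjunctions, i.e. f(x) = x1x2x4 OR x1x3x4 and h = x2 OR x3, and it picks g(z, v) = x1 AND x4 AND v as the outer function; that choice of g is an assumption consistent with the factored form, not something the slide spells out.

```python
from itertools import product

S0 = (0, 3, 4)   # 0-based positions of x1, x4, x5
S1 = (1, 2)      # 0-based positions of x2, x3

def project(x, S):
    """Restrict the vector x to the index set S, written x[S] on the slide."""
    return tuple(x[i] for i in S)

def f(x):
    x1, x2, x3, x4, x5 = x
    return (x1 & x2 & x4) | (x1 & x3 & x4)

def h(y):
    x2, x3 = y
    return x2 | x3

def g(z, hv):
    x1, x4, x5 = z           # x5 is in S0 but does not influence f
    return x1 & x4 & hv

# Compare f with g(x[S0], h(x[S1])) on all 32 vectors of {0,1}^5.
agrees = all(
    f(x) == g(project(x, S0), h(project(x, S1)))
    for x in product((0, 1), repeat=5)
)
print(agrees)  # True
```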

Another example: the concept of "square"
Square: $f(x_1, x_2, x_3)$
- $x_1$: contains a right angle
- $x_2$: the lengths of all edges are equal
- $x_3$: the number of vertices is 4
Decomposed form: $f(x_1, x_2, x_3) = g(x_1, h(x_2, x_3))$
- $h$: rhombus, defined on $x_2$ (the lengths of all edges are equal) and $x_3$ (the number of vertices is 4)

The number of data and decomposable structures
Case 1: The given data set is small.
- Advantage: less computational time is needed to find a decomposable structure.
- Disadvantage: decomposable structures arise easily in the data (because there are fewer constraints), so most decomposable structures are deceptive.

The number of data and decomposable structures
Case 2: The given data set is large.
- Advantage: deceptive decomposable structures will not be found.
- Disadvantage: more computational time is needed.
How many data vectors should be prepared to extract real decomposable structures? Answer: an index of decomposability.

Overview of our approach
(T, F) is decomposable $\iff$ the conflict graph of (T, F) is bipartite (Boros et al. 1994).
Assume that (T, F) is a set of $l$ randomly chosen vectors from $\{0, 1\}^n$.
1. Compute the probability that an edge appears in the conflict graph.
2. Regard the conflict graph as a random graph.
Then investigate the probability that the conflict graph is non-bipartite.

Conflict graph
[Figure: conflict graph of (T, F)]
(T, F) is decomposable $\iff$ the conflict graph of (T, F) is bipartite.
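The bipartiteness test can be sketched as follows. The transcript does not preserve the definition of the conflict graph, so the code assumes the usual one for a fixed partition (S0, S1): vertices are the S1-parts of the data vectors, and y and y' are joined whenever some a in T and b in F agree on S0 and have S1-parts y and y'. The function names and the toy data are hypothetical.

```python
from collections import defaultdict, deque

def project(x, S):
    return tuple(x[i] for i in S)

def conflict_graph(T, F, S0, S1):
    """Adjacency sets of the conflict graph of (T, F) for the partition (S0, S1)."""
    neg_by_s0 = defaultdict(list)      # S1-parts of negative examples, grouped by S0-part
    for b in F:
        neg_by_s0[project(b, S0)].append(project(b, S1))
    adj = defaultdict(set)
    for a in T:
        ya = project(a, S1)
        for yb in neg_by_s0.get(project(a, S0), []):
            adj[ya].add(yb)            # h must give ya and yb different values
            adj[yb].add(ya)
    return adj

def is_bipartite(adj):
    """Two-colour the graph by breadth-first search; fail on any odd cycle."""
    color = {}
    for start in list(adj):
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:   # also catches self-loops
                    return False
    return True

def is_decomposable(T, F, S0, S1):
    return is_bipartite(conflict_graph(T, F, S0, S1))

S0, S1 = (0,), (1, 2)
# Toy data whose conflict graph is a path: decomposable.
print(is_decomposable([(1, 1, 0), (1, 0, 1)], [(1, 0, 0)], S0, S1))   # True
# Toy data whose conflict graph is a triangle: not decomposable.
T_bad = [(0, 0, 0), (0, 1, 0), (1, 0, 0)]
F_bad = [(0, 0, 1), (1, 1, 0)]
print(is_decomposable(T_bad, F_bad, S0, S1))                          # False
```

A valid two-colouring of the graph is exactly an assignment of 0/1 values for h on the S1-parts, which is why bipartiteness and decomposability coincide for a fixed partition.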

Probability that an edge appears in the conflict graph
An edge appears in the conflict graph $\iff$ there exists a linked pair.
A pair of vectors is called linked if [condition not preserved].

Define a random variable by [definition not preserved], where the edge appears in the conflict graph $\iff$ there exists a linked pair.
We want to compute [quantity not preserved].

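How to compute [quantity not preserved]
[Quantity not preserved] is easier to compute.
1. Both of [text not preserved]
2. They have different values (i.e., 0 and 1)
Notation: $L = |T| + |F|$, $p : q = |T| : |F|$, $M = 2^n$, $m = 2^{|S_0|}$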

Approximation of [quantity not preserved]
By the inclusion-exclusion principle, [formula not preserved]

Random graph
In our analysis, [symbol not preserved] is assumed to be the probability that an edge appears in the conflict graph.
Random graph $G(N, r)$
- $N$: the number of vertices
- Each edge $e = (u, v)$ appears in $G(N, r)$ with probability $r$, independently
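The random graph model G(N, r) is easy to simulate. The sketch below samples such graphs and estimates how often they are bipartite; the function names, the trial counts and the chosen values of r are illustrative assumptions, not parameters from the slides.

```python
import random
from collections import deque

def random_graph(N, r, rng):
    """Sample G(N, r): each of the N*(N-1)/2 possible edges appears independently with probability r."""
    adj = [[] for _ in range(N)]
    for u in range(N):
        for v in range(u + 1, N):
            if rng.random() < r:
                adj[u].append(v)
                adj[v].append(u)
    return adj

def is_bipartite(adj):
    """Two-colour the graph by breadth-first search; fail on any odd cycle."""
    color = [None] * len(adj)
    for start in range(len(adj)):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False
    return True

def bipartite_fraction(N, r, trials, seed=0):
    """Fraction of sampled graphs G(N, r) that are bipartite."""
    rng = random.Random(seed)
    hits = sum(is_bipartite(random_graph(N, r, rng)) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    N = 64
    for c in (0.2, 1.0, 5.0):          # edge probability r = c / N
        print(c, bipartite_fraction(N, c / N, trials=200))
```

Running the loop for edge probabilities well below and well above 1/N gives a quick empirical picture of how the bipartite fraction changes with r.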

Probability that a random graph is non-bipartite
- $Y_{odd}$: random variable representing the number of odd cycles in $G(N, r)$
- $\Pr(Y_{odd} \ge 1)$: probability that $G(N, r)$ is not bipartite
By Markov's inequality, $\Pr(Y_{odd} \ge 1) \le E[Y_{odd}]$.
The number of sequences of $k$ vertices: [formula not preserved]
For sufficiently large $N$, [formula not preserved]
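The argument on this slide can be sketched with a standard first-moment computation. The slide's own formulas were not preserved, so the exact form below is an assumption; it uses only Markov's inequality and the number of $k$-cycles in a complete graph on $N$ vertices.

$$
\Pr(Y_{odd} \ge 1) \;\le\; E[Y_{odd}]
\;=\; \sum_{\substack{k \text{ odd} \\ 3 \le k \le N}} \frac{N(N-1)\cdots(N-k+1)}{2k}\, r^k
\;\le\; \sum_{\substack{k \text{ odd} \\ k \ge 3}} \frac{(Nr)^k}{2k}
$$

Here $N(N-1)\cdots(N-k+1)$ counts the sequences of $k$ distinct vertices, and each cycle is counted $2k$ times ($k$ starting points, 2 directions). For $Nr < 1$ the last sum equals $\tfrac{1}{2}\left(\operatorname{artanh}(Nr) - Nr\right)$, which is small when $Nr$ is well below 1. Under this reading, a random graph is likely to be bipartite when $r$ is well below $1/N$; the threshold used in the talk may be stated differently.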

Our index
The index combines:
- The probability that an edge appears in the conflict graph
- The threshold for a random graph to be bipartite or not
Assumptions:
- The probabilities $p$ and $q$ are given by $p : q = |T| : |F|$
- The conflict graph is a random graph
($|S_0| + |S_1| = n$)

Our index
- If $|T| + |F|$ is below the index, (T, F) tends to have many deceptive decomposable structures.
- If $|T| + |F|$ is above the index, (T, F) tends to have no deceptive decomposable structure.

Numerical experiments
Randomly generated data
- Target functions are not decomposable
- Dimensions of data are $n = 10, 20$
- Two types of data: biased and not biased
Procedure:
1. Prepare non-decomposable randomly generated functions and construct 10 (T, F)s for each data size
2. Check their decomposability
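The experimental procedure can be approximated by the rough sketch below. It makes several assumptions that the slides do not confirm: the target is a uniformly random Boolean function (likely, but not guaranteed, to be non-decomposable), only partitions with |S0| >= 1 and |S1| >= 2 are tried, the conflict graph follows the usual definition for a fixed partition, and the dimension is kept small so that every partition can be enumerated. All function names are hypothetical.

```python
import random
from collections import defaultdict, deque
from itertools import combinations, product

def decomposable_for(T, F, S0, S1):
    """Bipartiteness test of the conflict graph of (T, F) for one partition (S0, S1)."""
    def proj(x, S):
        return tuple(x[i] for i in S)
    neg = defaultdict(list)
    for b in F:
        neg[proj(b, S0)].append(proj(b, S1))
    adj = defaultdict(set)
    for a in T:
        ya = proj(a, S1)
        for yb in neg.get(proj(a, S0), []):
            adj[ya].add(yb)
            adj[yb].add(ya)
    color = {}
    for s in list(adj):
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False
    return True

def has_decomposable_structure(T, F, n):
    """True if some partition with |S0| >= 1 and |S1| >= 2 passes the test."""
    for size in range(2, n):
        for S1 in combinations(range(n), size):
            S0 = tuple(i for i in range(n) if i not in S1)
            if decomposable_for(T, F, S0, S1):
                return True
    return False

def decomposable_ratio(n, sampling_ratio, trials, seed=0):
    """Fraction of sampled (T, F)s that admit at least one decomposable structure."""
    rng = random.Random(seed)
    cube = list(product((0, 1), repeat=n))
    hits = 0
    for _ in range(trials):
        target = {x: rng.randint(0, 1) for x in cube}   # random target function
        size = max(1, round(sampling_ratio * len(cube)))
        sample = rng.sample(cube, size)
        T = [x for x in sample if target[x] == 1]
        F = [x for x in sample if target[x] == 0]
        hits += has_decomposable_structure(T, F, n)
    return hits / trials

if __name__ == "__main__":
    for ratio in (0.1, 0.3, 0.5, 0.8):
        print(ratio, decomposable_ratio(n=6, sampling_ratio=ratio, trials=10))
```

Plotting the returned ratio against the sampling ratio gives the kind of curve the experiments report, although the brute-force enumeration here only scales to small n.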

[Figure: randomly generated data. Axes: sampling ratio (%) and ratio of decomposable (T, F)s (%); our index is marked.]

[Figure: randomly generated data. Axes: sampling ratio (%) and ratio of decomposable (T, F)s (%); our index is marked.]

Real-world data
Breast Cancer in Wisconsin (a.k.a. BCW)
- Already binarized
- The dimension is $n = 11$
- Compared with randomly generated data with the same $n$, $p$ and $q$

[Figure: BCW and randomly generated data. Axes: sampling ratio (%) and ratio of decomposable (T, F)s (%); our index is marked.]

Discussion and conclusion
- An index to extract reliable decomposable structures
- Computational experiments on random and real-world data
  - The proposed index is a good estimate
  - $|S_0| = 1$ or $|S_1| = 2$ $\Rightarrow$ the threshold behavior is not clear

Future work
- Analyze the sharpness of the threshold behavior, to know a sufficient $|T| + |F|$ for extracting reliable decomposable structures
- Apply a similar approach to other classes of Boolean functions
[Figure: number of decomposable structures against $|T| + |F|$, marking the proposed index and the value we want to estimate]