1
PROBABILISTIC AND LOGIC APPROACHES TO MACHINE LEARNING AND DATA MINING
Marek Perkowski Portland State University
2
Essence of the logic synthesis approach to learning
3
Example of Logical Synthesis
[Photos of eight guys: John, Dave, Mark, Jim, Alan, Nick, Mate, Robert.]
4
The eight guys are split into two classes. Good guys: Dave, Jim, John, Mark. Bad guys: Alan, Nick, Mate, Robert. The attributes are A - size of hair, B - size of nose, C - size of beard, D - color of eyes.
5
[Karnaugh map of the good guys: rows AB (hair, nose), columns CD (beard, eye color), in Gray order 00, 01, 11, 10. Mark, John, Dave, and Jim occupy the 1-cells A'B'CD, A'B'CD', A'BCD, and A'BCD'.]
6
[Karnaugh map of the bad guys: rows AB, columns CD. Alan, Nick, Mate, and Robert occupy the 1-cells A'B'C'D, ABCD, A'BC'D', and AB'C'D. The product A'C, the cover of the good guys, is marked for contrast.]
7
Generalization 1: bald guys with beards are good; the single product A'C (A' = little hair, C = beard) covers all four good guys. Generalization 2: all other guys are no good. [Karnaugh map with the A'C loop marked; A - size of hair, B - size of nose, C - size of beard, D - color of eyes.]
8
SOP (DNF) approach to learning
9
Sum of Products: AND gates, followed by an OR gate that produces the output (with inverters used as needed). There are many algorithms to minimize SOP expressions; they were created either in the ML community or in the logic synthesis community. We will illustrate three different algorithms.
10
Method 1: SOP minimization based on graph coloring
11
Reduction of SOP (DNF) Machine Learning to graph coloring
SOP through Graph Coloring. In the previous example there were 4 binary variables; here there are two variables, each with 4 values. We encode every group or minterm using the encoding shown on the right. For every two groups we check whether they can be combined: we take the bitwise OR of their encodings. If the combined group does not cover any zeros, the groups can be combined; if it covers zeros, they cannot. Let us try to combine a1 and b1: the bitwise OR gives a combined group that does not cover zeros, so they are compatible.
12
Reduction of SOP (DNF) Machine Learning to graph coloring
SOP through Graph Coloring. Let us try to combine a2 and b2: the bitwise OR gives a combined group that covers zeros, so a2 and b2 are not compatible. For every pair of incompatible groups there is an edge in the graph.
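As an illustration, here is a minimal Python sketch of this check, assuming a positional-cube encoding for two 4-valued variables; the group names and masks below are hypothetical, not the ones in the slide's figure:

```python
# Positional-cube encoding for two 4-valued variables X and Y: each group
# is a pair of bitmasks, one per variable; bit v set means value v is
# allowed. The cells covered are the Cartesian product of the two masks.
# OR-ing two groups per variable yields the smallest group containing
# both; if that supercube touches a 0-cell, the merge is illegal.

def covers(group, cell):
    xmask, ymask = group
    x, y = cell
    return bool((xmask >> x) & 1) and bool((ymask >> y) & 1)

def compatible(g, h, zero_cells):
    merged = (g[0] | h[0], g[1] | h[1])     # bitwise OR per variable
    return not any(covers(merged, c) for c in zero_cells)

# Hypothetical groups and 0-cells (not the ones in the figure):
a1 = (0b0001, 0b0011)      # X = 0, Y in {0, 1}
b1 = (0b0010, 0b0011)      # X = 1, Y in {0, 1}
a2 = (0b0100, 0b0001)      # X = 2, Y = 0
b2 = (0b1000, 0b0100)      # X = 3, Y = 2
zeros = [(3, 0), (2, 2)]   # cells where the function is 0

print(compatible(a1, b1, zeros))   # True: the merged group avoids all zeros
print(compatible(a2, b2, zeros))   # False: the supercube covers both zeros
```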
13
Reduction of SOP (DNF) Machine Learning to graph coloring
SOP through Graph Coloring. Based on the pairwise incompatibility of groups we create the INCOMPATIBILITY GRAPH: there is an edge between every two nodes whose groups are incompatible. We color the graph with the minimum number of colors; this minimum is called the chromatic number. Finally, we combine the nodes that received the same color.
14
The minimum coloring corresponds to the minimum number of combined groups in the final solution. These groups are usually products, but they may also be of the form PRODUCT1 · (PRODUCT2)'.
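A minimal sketch of the coloring step, using a simple greedy heuristic (which only approximates the chromatic number; exact coloring is NP-hard). The node names and incompatibility edges are hypothetical:

```python
def greedy_color(nodes, edges):
    """Greedy coloring of the incompatibility graph; nodes that receive
    the same color are pairwise compatible and can be merged."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    color = {}
    for n in nodes:                    # a smarter order can save colors
        used = {color[m] for m in adj[n] if m in color}
        color[n] = next(c for c in range(len(nodes)) if c not in used)
    return color

nodes = ["a1", "b1", "a2", "b2"]
edges = [("a2", "b2"), ("a1", "a2")]   # hypothetical incompatible pairs
print(greedy_color(nodes, edges))      # {'a1': 0, 'b1': 0, 'a2': 1, 'b2': 0}
```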
15
Method 2: SOP minimization based on set covering with primes
16
SOP through Set Covering
Find all prime implicants of the function. Create a table whose columns are the true minterms and whose rows are the prime implicants. This is called the covering problem: you want to find the smallest subset of rows that covers all columns. There are many algorithms for this problem; some use BDDs, some SAT, some matrices. The same method can be used for Boolean minimization, for test generation (covering all faults with the minimum number of tests), and for selecting the best positions of robots guarding a building from terrorists.
17
SOP through Set Covering. [Covering table: the columns correspond to the minterms with value 1; the rows correspond to the prime implicants T0, T1, T2, T3.] T0 and T2 alone are not a solution, because column b0 is not covered. T0, T2, and T3 together are a solution.
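A sketch of the covering step with a greedy heuristic; exact minimum covering is NP-hard, and practical tools use branch-and-bound, BDDs, or SAT instead. The prime/minterm incidences below are invented so that, as on the slide, T0 and T2 leave b0 uncovered while T0, T2, T3 form a cover:

```python
# Rows are prime implicants; each maps to the true minterms it covers.
primes = {
    "T0": {"a0", "a1"},
    "T1": {"a1"},
    "T2": {"c0", "c1"},
    "T3": {"b0", "c1"},
}
minterms = {"a0", "a1", "b0", "c0", "c1"}

def greedy_cover(primes, minterms):
    """Repeatedly pick the prime covering the most uncovered minterms."""
    uncovered, cover = set(minterms), []
    while uncovered:
        best = max(primes, key=lambda p: len(primes[p] & uncovered))
        cover.append(best)
        uncovered -= primes[best]
    return cover

print(greedy_cover(primes, minterms))   # ['T0', 'T2', 'T3']
```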
18
Method 3: SOP minimization based on sequential finding of secondary essential primes
19
Machine Learning SOP through sequential finding of essential and secondary essential primes
Step 1: find the essential primes. [Karnaugh map (rows AB, columns CD) with the essential primes marked.]
20
Machine Learning SOP through sequential finding of essential and secondary essential primes
Step 2: remove the essential primes. [Karnaugh map: with the essential primes gone, the orange and yellow primes are redundant, and a secondary essential prime appears.]
21
Step 3: iterate. [Karnaugh map with the essential and secondary essential primes marked.] The solution consists of the essential primes and the secondary essential primes of all levels. If the algorithm does not terminate, make a random choice and iterate, or use another algorithm.
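A sketch of the whole loop, assuming the covering chart is given as sets of minterms per prime (the chart below is hypothetical). The row-dominance step is not spelled out on the slides, but it is the standard way redundant primes are dropped so that secondary essentials can appear:

```python
def essentials(primes, remaining):
    """Primes that are the only cover of some still-uncovered minterm."""
    ess = []
    for p in primes:
        alone = primes[p] & remaining
        for q in primes:
            if q != p:
                alone -= primes[q]
        if alone:
            ess.append(p)
    return ess

def cover_by_essentials(chart, minterms):
    primes = {p: set(c) for p, c in chart.items()}
    remaining, solution = set(minterms), []
    while remaining:
        ess = essentials(primes, remaining)
        if not ess:        # cyclic core left: make a choice, or switch method
            break
        for p in ess:
            remaining -= primes.pop(p)
        solution += ess
        # Row dominance: a prime whose remaining coverage sits inside another
        # prime's coverage is redundant; dropping it is what exposes the
        # secondary essential primes on the next iteration.
        for q in list(primes):
            qc = primes[q] & remaining
            if any(p != q and qc <= (primes[p] & remaining) and
                   (qc != (primes[p] & remaining) or p < q) for p in primes):
                primes.pop(q)
    return solution

# Hypothetical chart: P1 is essential; removing it makes P2 and P5 dominated,
# which in turn makes P3 and P4 secondary essential.
chart = {"P1": {"m1", "m2"}, "P2": {"m2", "m3"}, "P3": {"m3", "m4"},
         "P4": {"m4", "m5"}, "P5": {"m5"}}
print(cover_by_essentials(chart, {"m1", "m2", "m3", "m4", "m5"}))
# ['P1', 'P3', 'P4']
```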
22
Multivalued relations approach to learning
23
Short Introduction: multiple-valued logic
Signals can take values from some set, for instance {0,1,2} or {0,1,2,3}:
{0,1} - binary logic (a special case)
{0,1,2} - ternary logic
{0,1,2,3} - quaternary logic, etc.
[Gate symbols MIN and MAX: MIN outputs the minimal value of its inputs and MAX the maximal value; e.g. MIN(1,2) = 1 and MAX(2,3) = 3.]
24
Functional Decomposition
Evaluates the data function and attempts to decompose it into simpler functions: F(X) = H(G(B), A), where X = A ∪ B; A is the free set and B is the bound set. If A ∩ B = ∅, it is a disjoint decomposition; if A ∩ B ≠ ∅, it is a non-disjoint decomposition.
25
Pros and cons: In generating the final combinational network, BDD-based decomposition (built on multiplexers) and SOP decomposition trade flexibility in circuit topology for time efficiency. Generalized functional decomposition sacrifices speed for a higher likelihood of minimizing the complexity of the final network.
26
A Standard Map of function ‘z’
[Map of z: the bound-set variable c indexes the columns and the free-set variables a, b index the rows. Columns 0 and 1 and columns 0 and 2 are compatible; column compatibility = 2.]
27
Principle of finding patterns
We have a tabular representation of data and we want to find patterns; in this case we are looking for patterns in columns. Two columns have the same pattern if, in every row, their symbols can be combined; we then say that the columns are COMPATIBLE. If in a row we have 0 and 0, 1 and 1, 0 and -, 1 and -, or - and -, the entries can be combined. If we have a 0 and a relation {0,1}, the columns are still compatible, since one can select 0 from the choice {0,1}.
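A minimal sketch of this compatibility test, representing every cell as a set of allowed values so that 0, 1, don't care, and relation cells such as {0,1} are handled uniformly; the three columns below are hypothetical:

```python
# A cell is the set of values it allows: {0} or {1} for a fixed value,
# {0, 1} for a don't care or for the relation "0 or 1". Two columns are
# compatible if in every row the intersection of their cells is non-empty,
# i.e. a common value can be selected.

DC = frozenset({0, 1})

def compatible(col_a, col_b):
    return all(a & b for a, b in zip(col_a, col_b))

# Hypothetical 3-row columns (not the table from the slides):
c0 = [{0}, DC, {1}]
c1 = [{0}, {1}, DC]
c2 = [{1}, {0, 1}, {1}]

print(compatible(c0, c1))   # True:  every row admits a common value
print(compatible(c0, c2))   # False: row 0 demands both 0 and 1
```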
28
Decomposition of Multi-Valued Relations
F(X) = H(G(B), A), X = A ∪ B, where F, G, and H are now relations rather than functions. If A ∩ B = ∅, it is a disjoint decomposition; if A ∩ B ≠ ∅, it is a non-disjoint decomposition.
29
Forming a CCG from a K-Map
[The same map of z, with bound set {c} (columns) and free set {a, b} (rows). Columns 0 and 1 and columns 0 and 2 are compatible; the column compatibility index = 2. This yields the Column Compatibility Graph on nodes C0, C1, C2 with edges C0-C1 and C0-C2.]
30
Column Incompatibility Graph
Forming a CIG from a K-Map. [From the same map of z: columns 1 and 2 are incompatible, so the Column Incompatibility Graph on C0, C1, C2 has the single edge C1-C2; its chromatic number = 2.]
31
Column Compatibility Graph Column Incompatibility Graph
CCG and CIG are complementary graphs. Graph coloring of the CIG corresponds to clique partitioning of the CCG, and graph multi-coloring of the CIG corresponds to maximal clique covering of the CCG. [Both graphs on C0, C1, C2 shown side by side.]
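A small sketch of this complementarity on the three-column example above; the coloring shown is one valid choice:

```python
from itertools import combinations

nodes = ["C0", "C1", "C2"]
ccg_edges = {("C0", "C1"), ("C0", "C2")}    # compatible pairs from the slides

# The CIG is the complement graph: exactly the non-edges of the CCG.
cig_edges = {e for e in combinations(nodes, 2)
             if e not in ccg_edges and tuple(reversed(e)) not in ccg_edges}
print(cig_edges)                            # {('C1', 'C2')}

# Each color class of a proper CIG coloring is an independent set of the
# CIG, hence a clique of the CCG: coloring the CIG is the same task as
# clique-partitioning the CCG.
coloring = {"C0": 0, "C1": 0, "C2": 1}      # 2 colors = chromatic number
assert all(coloring[u] != coloring[v] for u, v in cig_edges)
```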
32
Clique partitioning example.
33
Maximal clique covering example.
34
Map of relation G, before and after induction from the CIG. [Two maps G \ c shown side by side.] After induction, g = a high-pass filter whose acceptance threshold begins at c > 1.
35
The Meaning of Attributes
36
Attributes can be static or dynamic.
Static: facial features, gestures, objects to grasp, objects to avoid, symptoms of illness, views of body cells, crystallization parameters of liquids.
Dynamic: changes in the stock market, changes in facial features (facial gestures), change of an object's view as a robot approaches it, dynamic change of a body part in motion, changes of moles on the skin, changing symptoms of an illness.
37
Static: one vector of attributes p1 p2 p3 p4 p5 p6 p7 p8. Dynamic: the same attributes sampled at times t0, t1, t2, represented as one long concatenated vector for machine learning.
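A minimal sketch of this flattening; the attribute values are hypothetical:

```python
# Three snapshots of the same eight attributes p1..p8 at t0, t1, t2,
# concatenated into one 24-element vector so any static learner applies.
t0 = [1, 0, 1, 1, 0, 0, 1, 0]   # hypothetical attribute values at t0
t1 = [1, 0, 0, 1, 0, 1, 1, 0]
t2 = [0, 0, 0, 1, 1, 1, 1, 0]

example = t0 + t1 + t2          # one long vector for machine learning
assert len(example) == 24
```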
38
Representation Models for Logic-Based Machine Learning
39
Types of Logical Synthesis
Sum of Products; Decision Trees; Decision Diagrams; Functional Decomposition (the method we are using).
40
Binary Decision Diagrams
There are many types of decision trees and many generalizations of them, used in logic and in ML.
41
Decision Diagrams. A decision diagram breaks a Karnaugh map down into a set of decision trees. A decision diagram terminates when all branches end in a yes, no, or don't-care value. [Example Karnaugh map.] The diagram can become quite complex if the data is spread out, as in the following example.
42
Decision Tree for Example Karnaugh Map
43
BDD Representation of function
[Karnaugh map (rows AB, columns CD) of an incompletely specified function, with 1-cells and don't cares.]
44
BDD Representation of function
[Karnaugh map (variables AB, CD) of a completely specified function.] The problem is how to find the minimum tree or decision diagram for your given data.
45
Absolute Minimum Background on Binary Decision Diagrams (BDD)
BDDs are based on the recursive Shannon expansion F = x Fx + x' Fx'. They are a compact data structure for Boolean logic and can represent sets of objects (states) encoded as Boolean functions. Reduced ordered BDDs (ROBDDs) are canonical, which is essential for simulation, analysis, synthesis, and verification.
46
Other expansions, other trees, other diagrams.
The standard decision tree is based on the Shannon expansion F = x Fx + x' Fx'. A Shannon node for variable x splits all examples into Fx' (where x = 0) and Fx (where x = 1). This is the same concept as an ML split on an attribute such as WIND: all examples with WIND = WEAK go to one branch and all examples with WIND = STRONG to the other. Data separation in ML is the same as Shannon expansion in logic.
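A sketch of this correspondence: taking a cofactor of a function given by example rows is exactly the decision-tree split. The dataset below is hypothetical:

```python
# A "function" here is a list of (assignment, label) examples, with an
# assignment being a dict of attribute values.
examples = [
    ({"WIND": "WEAK",   "SKY": "SUNNY"}, 1),
    ({"WIND": "WEAK",   "SKY": "RAINY"}, 0),
    ({"WIND": "STRONG", "SKY": "SUNNY"}, 1),
    ({"WIND": "STRONG", "SKY": "RAINY"}, 1),
]

def cofactor(examples, var, value):
    """Shannon-style split: keep the examples where var == value and drop
    the variable, exactly like one branch of a decision-tree node."""
    return [({k: v for k, v in x.items() if k != var}, y)
            for x, y in examples if x[var] == value]

f_weak   = cofactor(examples, "WIND", "WEAK")    # F restricted to WIND=WEAK
f_strong = cofactor(examples, "WIND", "STRONG")  # F restricted to WIND=STRONG
print(f_weak)    # [({'SKY': 'SUNNY'}, 1), ({'SKY': 'RAINY'}, 0)]
```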
47
The same background applies to Kronecker Functional Decision Diagrams. In the Shannon expansion F = x Fx + x' Fx', Fx is the positive cofactor of F with respect to variable x, and Fx' is the negative cofactor.
48
BDD construction is typically done using the APPLY operator. Reduction rules: remove duplicate terminals; merge duplicate nodes (isomorphic subgraphs); remove redundant nodes, i.e. nodes with identical children. [Example diagram over variables a, b, c and terminal 1.]
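A minimal sketch of the reduction rules via a unique table (the APPLY operator itself is not shown). Nodes are plain tuples, so structural equality implements the merge rule; the example rebuilds the ROBDD of f = (a+b)c from the following slides:

```python
# A node is the tuple (var, low, high); the terminals are 0 and 1.
unique = {}

def mk(var, low, high):
    """Create a canonical node, applying both reduction rules."""
    if low == high:                 # redundant node: identical children
        return low
    key = (var, low, high)          # structural equality merges duplicates
    return unique.setdefault(key, key)

# f = (a + b)c with variable order a < b < c:
c_node = mk("c", 0, 1)
b_red  = mk("b", c_node, c_node)    # redundant: returns c_node itself
f      = mk("a", mk("b", 0, c_node), c_node)
print(f)                 # ('a', ('b', 0, ('c', 0, 1)), ('c', 0, 1))
print(f[1][2] is f[2])   # True: the c node is shared, not duplicated
```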
49
BDD Construction – your first BDD
Construction of a reduced ordered BDD for f = ac + bc. [Truth table over a, b, c and the corresponding decision tree, with 0-edges and 1-edges marked.]
50
BDD Construction – cont’d
[The decision tree is reduced in three steps: 1. remove duplicate terminals; 2. merge duplicate nodes; 3. remove redundant nodes. The result is the ROBDD of f = (a+b)c.]
51
ROBDD as a multiplexer (MUX) circuit. [The ROBDD of F over a, b, c, d redrawn as a network of multiplexers; each node's E and T edges become the data inputs of a MUX.] What is a decision diagram in theory and in ML is a logic circuit built from multiplexers in logic design.
52
Kronecker Decision Diagrams
There are many types of decision trees and many generalizations of them, used in logic and in ML. Kronecker decision diagrams are a generalization of BDDs.
53
Decomposition types are associated with the variables in Xn by means of a decomposition type list (DTL) d := (d1, …, dn), where di ∈ {S, pD, nD} (Shannon, positive Davio, negative Davio).
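For reference, the three expansions on the cofactors f0 = F(x=0) and f1 = F(x=1) are: Shannon F = x' f0 + x f1, positive Davio F = f0 ⊕ x(f0 ⊕ f1), and negative Davio F = f1 ⊕ x'(f0 ⊕ f1). A small sketch verifying that each decomposition type reconstructs the same function:

```python
# The three node types applied to the cofactors f0 = F(x=0), f1 = F(x=1):
#   S  (Shannon):        F = x'*f0 + x*f1          children (f0, f1)
#   pD (positive Davio): F = f0 XOR x*(f0 XOR f1)  children (f0, f0 XOR f1)
#   nD (negative Davio): F = f1 XOR x'*(f0 XOR f1) children (f1, f0 XOR f1)

def expand(dtype, x, f0, f1):
    if dtype == "S":
        return f1 if x else f0
    if dtype == "pD":
        return f0 ^ (x & (f0 ^ f1))
    if dtype == "nD":
        return f1 ^ ((1 - x) & (f0 ^ f1))
    raise ValueError(dtype)

# All three decomposition types reconstruct the same function:
for f0 in (0, 1):
    for f1 in (0, 1):
        for x in (0, 1):
            ref = f1 if x else f0
            assert all(expand(d, x, f0, f1) == ref for d in ("S", "pD", "nD"))
```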
54
KFDD definition: in KF trees and KFDDs we can have any variable, with any of the three expansions, at any level.
55
Representation with reversible gates: the nodes of a KFDD or KF tree can be interpreted as logic gates, as shown below. Example: F = a'b ⊕ ac, obtained from the expansion F = x Fx + x' Fx' (Fx is the positive cofactor of F with respect to x, Fx' the negative cofactor). [Circuit over inputs a, b, c built from a Shannon cell and a Dipal cell.]
56
Three different reduction types
Type I: each node in a DD is a candidate for the application of this reduction; it applies to BDDs and KDDs. [Diagram: duplicate nodes labeled xi with the same subfunctions f and g are merged.]
57
Three different reduction types (cont’d)
[Diagram: a node xi whose edges lead to the same successor xj, with subfunctions f and g, is removed. This also applies to BDDs and KDDs.]
58
Three different reduction types (cont’d)
Type D: this reduction is used in functional diagrams such as KFDDs. [Diagram: nodes xi and xj with subfunctions f and g, before and after the reduction.]
59
Example for OKFDD. [Diagram: an OKFDD with Shannon nodes for x1 and x4, from which the function F is read off as terms built from X1, X2, and X4.]
60
Example for OKFDD (cont’d)
The diagram below explains the expansions from the previous slide: x1 is an S-node, x2 is a pD-node, and x3 is an nD-node.
61
How to combine trees with any other classifiers
62
It can be Shannon Tree or KFDD or any tree
You have a tree on top; it can be a Shannon tree, a KFDD, or any other tree. The variables at the top of the tree select a cofactor, and each cofactor (leaf) feeds ANY REMAINDER LOGIC, that is, any other classifier.
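A minimal sketch of this hybrid architecture, with a fixed two-variable tree on top and a trivial majority-vote classifier standing in for the remainder logic at each leaf; all names and data are hypothetical:

```python
# Top: a two-level Shannon tree on attributes "A" then "B".
# Leaves: any remainder classifier; a majority vote stands in here, but a
# neural net, SVM, etc. would plug in the same way.

class Majority:
    def fit(self, ys):
        self.label = max(set(ys), key=ys.count)
        return self

    def predict(self):
        return self.label

def train_hybrid(examples, split_vars):
    """Group examples by their cofactor of split_vars; train one remainder
    classifier per leaf."""
    leaves = {}
    for x, y in examples:
        key = tuple(x[v] for v in split_vars)
        leaves.setdefault(key, []).append(y)
    return {k: Majority().fit(ys) for k, ys in leaves.items()}

examples = [({"A": 0, "B": 0, "C": 1}, 1), ({"A": 0, "B": 0, "C": 0}, 1),
            ({"A": 0, "B": 1, "C": 1}, 0), ({"A": 1, "B": 0, "C": 1}, 0)]
model = train_hybrid(examples, ["A", "B"])
print(model[(0, 0)].predict())   # 1: the leaf classifier for cofactor A=0, B=0
```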
63
Overview of data mining
64
What is Data Mining? Databases with millions of records and thousands of fields are now common in business, medicine, engineering, and the sciences. Extracting useful information from such data sets is an important practical problem. Data mining is the study of methods to find useful information in databases and to use the data to make predictions about the people or events the data was derived from. Goals: classification with small error; understanding the data (patterns, rules, statistics).
65
Some Examples of Data Mining
1) Stock Market Predictions 2) Large companies tracking sales 3) Military and intelligence applications
66
Data Mining in Epidemiology
Epidemiologists track the spread of infectious disease and try to determine the disease's original source. Often epidemiologists have only an initial suspicion about what is causing an illness; they interview people to find out what those who got sick have in common. Currently they have to sort through this data by hand to determine the initial source of the disease. A data mining application would speed up this process and let them quickly track the source of an infectious disease.
67
Types of Data Mining
Data mining applications use, among others, three methods to process data: 1) neural nets; 2) statistical (probabilistic) analysis; 3) logical methods. The last two are the methods we will discuss here.
68
DATA MINING requires understanding of the underlying area, such as medicine or finance. MACHINE LEARNING is formal mathematics and programming.
69
Conclusions
70
Logic Synthesis and ML classifier design are similar
Logic synthesis and ML classifier design are similar. In ML we have continuous or multi-valued data and many don't cares, and we design for accuracy and interpretability; Occam's Razor plays a similar role in both.
SOP is a very good method, not sufficiently recognized by the ML community. We have shown three algorithms for SOP; there are many more.
The compatibility of columns is important for finding patterns in data. We will use this method; it can be reduced to clique partitioning, clique covering, and graph coloring.
Decision trees, as shown in the previous lectures, are only particular examples of hierarchically decomposed structures. Other structures are decision diagrams and functional decompositions.
Kronecker diagrams and trees generalize decision diagrams and trees.
71
Questions and Problems
What is the principle of logic-based representation of data for machine learning? Give examples of learning representations based on logic.
Static versus dynamic attributes.
What is the difference between decision trees and decision diagrams?
How can Boolean concepts be generalized to MV concepts in machine learning?
Binary versus ternary Ashenhurst decomposition.
What is data mining and how can it be used? Give your own example of data mining using machine learning methods known to you.
What are bound and free sets in decomposition?
What is disjoint versus non-disjoint decomposition?
Can decomposition be combined with other learning methods? How?
72
Questions and Problems
Explain the graph coloring approach to SOP.
Explain the essential prime approach to SOP.
Explain the set covering (clique covering) approach to SOP.
Compare the covering and coloring ideas.
What is a maximum clique? What is a maximum independent set?
Apply the Shannon expansion to a function from any K-map in these slides; select any variable you want.
Apply the positive Davio expansion to any function above.
Apply the negative Davio expansion to any function above.
What is the difference between clique partitioning and clique covering? What is the relation of clique covering to SOP?
What is the difference between a decision tree and a decision diagram?
How do you create a KFDD for a function with don't cares? (DIFFICULT)