presented by Zbigniew Ras UNC-Charlotte, Computer Science
Action Rules [Z. Ras & A. Wieczorkowska]
Decision table
Any information system of the form S = (U, AFl ∪ ASt ∪ {d}), where d ∉ AFl ∪ ASt is a distinguished attribute called the decision.
The elements of ASt are called stable conditions; the elements of AFl ∪ {d} are called flexible conditions.
Example of an action rule:
[(b1, v1 → w1) ∧ (b2, v2 → w2) ∧ … ∧ (bp, vp → wp)](x) ⇒ [(d, k1 → k2)](x)
Assumption: (∀i)[(1 ≤ i ≤ p) → (bi ∈ AFl)]
Action Rules
[Decision Table: objects x1 to x7, attributes a, b, c, d; surviving entries: x1 S L; x2 R 1; x5 2 P; x7 H]
{a, c} - stable attributes, {b, d} - flexible attributes, d - decision attribute.
Rules discovered:
r1 = [(b, P) → (d, L)]
r2 = [(a, 2) ∧ (b, S) → (d, H)]
(r1, r2)-action rule: [(b, P → S)](x) ⇒ [(d, L → H)](x)
Notation: L(r2) = {a, b} (the attributes on the left-hand side of r2), R(r2) = d (its decision attribute).
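As a rough illustration of the construction above, the following Python sketch pairs two classification rules into an action rule. The rule layout (a dictionary of conditions plus a decision pair) and the name `make_action_rule` are assumptions made for this example, not part of the original algorithm; reporting the stable conditions of r2 separately is also a choice of this sketch.

```python
def make_action_rule(r1, r2, stable):
    """Pair r1 (current class) with r2 (target class) into an action rule.

    Each rule is (conditions, (decision_attribute, value)).
    Returns None when the two rules disagree on a shared stable attribute.
    """
    conds1, (d1, k1) = r1
    conds2, (d2, k2) = r2
    if d1 != d2:
        raise ValueError("both rules must have the same decision attribute")
    for a in stable:
        if a in conds1 and a in conds2 and conds1[a] != conds2[a]:
            return None  # a stable attribute cannot be changed
    # Flexible attributes listed in both rules with different values become changes.
    changes = {
        a: (conds1[a], w)
        for a, w in conds2.items()
        if a not in stable and a in conds1 and conds1[a] != w
    }
    # Stable conditions of the target rule are kept as requirements.
    required = {a: v for a, v in conds2.items() if a in stable}
    return {"changes": changes, "required": required, "decision": (d1, k1, k2)}


# The two rules from the slide: r1 = [(b, P) -> (d, L)], r2 = [(a, 2) ^ (b, S) -> (d, H)]
r1 = ({"b": "P"}, ("d", "L"))
r2 = ({"a": 2, "b": "S"}, ("d", "H"))
action_rule = make_action_rule(r1, r2, stable={"a", "c"})
# action_rule["changes"]  is {"b": ("P", "S")}
# action_rule["decision"] is ("d", "L", "H")
```

The result corresponds to [(b, P → S)](x) ⇒ [(d, L → H)](x), with the stable condition (a, 2) of r2 listed under `required`.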
E-Action Rules [L.-S. Tsay & Z. Ras]
Attributes: A (St), B (Flex), C (St), D (Flex), E (St), F (Flex), G (Decision)
Row 1: a1 * b1 * c1 * d g1
Row 2: a1 * b2 * e2 f g2
E-Action rule: (B, b1 → b2) ∧ (E = e2) ∧ (F, → f2) ⇒ (G, g1 → g2)
What about support & confidence of action rules?
[Object-Based] Support of Action Rules
Action rule r: [(b1, v1 → w1) ∧ (b2, v2 → w2) ∧ … ∧ (bp, vp → wp)](x) ⇒ [(d, k1 → k2)](x)
Object x certainly supports rule r in S = (X, A) if:
1) (∀i ≤ p)[bi(x) = vi] and d(x) = k1
2) (∃y ∈ X)(∀i ≤ p)[bi(y) = wi] and d(y) = k2
3) (∀b ∈ A - [{bi : 1 ≤ i ≤ p} ∪ {d}])[b(x) = b(y)]
     a1  a2  b1  b2  a3  a4  d
x    n1  n2  v1  v2  u1  u2  k1
y    n1  n2  w1  w2  u1  u2  k2
CSupS(r) = card{x: x certainly supports r in S}
[Object-Based] Support of Action Rules
Action rule r: [(b1, v1 → w1) ∧ (b2, v2 → w2) ∧ … ∧ (bp, vp → wp)](x) ⇒ [(d, k1 → k2)](x)
Object x possibly supports rule r in S = (X, A) if:
1) (∀i ≤ p)[bi(x) = vi] and d(x) = k1
2) (∃y ∈ X)(∀i ≤ p)[bi(y) = wi] and d(y) = k2
3) (∀c ∈ ASt)[c(x) = c(y)]
     a1  a2  b1  b2  c1  c2  d
x    n1  n2  v1  v2  u1  u2  k1
y    m1  m2  w1  w2  u1  u2  k2
PSupS(r) = card{x: x possibly supports r in S}
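The two object-based definitions differ only in which attributes x and y must agree on: all attributes outside the rule (certain support) or only the stable ones (possible support). A minimal Python sketch under that reading follows; the toy table and the function names are hypothetical and serve only to show the counts.

```python
def supports(x, table, changes, decision, agree_on):
    """True if object x supports the rule, with y required to match x on `agree_on`."""
    d, k1, k2 = decision
    row = table[x]
    # Condition 1: x has the "from" values and the current decision value k1.
    if row[d] != k1 or any(row[b] != v for b, (v, _) in changes.items()):
        return False
    # Conditions 2 and 3: some y has the "to" values, decision k2, and agrees with x.
    return any(
        other[d] == k2
        and all(other[b] == w for b, (_, w) in changes.items())
        and all(other[a] == row[a] for a in agree_on)
        for other in table.values()
    )


def certain_support(table, changes, decision):
    attrs = next(iter(table.values())).keys()
    others = [a for a in attrs if a != decision[0] and a not in changes]
    return sum(supports(x, table, changes, decision, others) for x in table)


def possible_support(table, changes, decision, stable):
    return sum(supports(x, table, changes, decision, stable) for x in table)


# Hypothetical toy table (not the table from the slides).
table = {
    "x1": {"a": 0, "b": "P", "c": 1, "d": "L"},
    "x2": {"a": 0, "b": "S", "c": 1, "d": "H"},
    "x3": {"a": 2, "b": "P", "c": 1, "d": "L"},
    "x4": {"a": 0, "b": "P", "c": 2, "d": "L"},
}
changes = {"b": ("P", "S")}      # (b, P -> S)
decision = ("d", "L", "H")       # (d, L -> H)

csup = certain_support(table, changes, decision)               # 1 (only x1)
psup = possible_support(table, changes, decision, stable={"a"})  # 2 (x1 and x4)
```

Here x4 supports the rule only possibly, because it differs from x2 on the attribute c, which is treated as flexible in this toy setup.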
[Rule-Based] Support of Action Rules
Action rule r: [(b1, v1 → w1) ∧ (b2, v2 → w2) ∧ … ∧ (bp, vp → wp)](x) ⇒ [(d, k1 → k2)](x)
Object x ∈ X supports rule r in S = (X, A) if there are two rules r1, r2 extracted from S and there exists an object y ∈ X satisfying the following conditions:
1) (∀i ≤ p)[[bi ∈ L(r1)] ⇒ [bi(x) = vi]] ∧ R(r1) = d ∧ d(x) = k1
2) (∀i ≤ p)[[bi ∈ L(r2)] ⇒ [bi(y) = wi]] ∧ R(r2) = d ∧ d(y) = k2
3) [b ∈ ASt] ⇒ [b(x) = b(y)], for every attribute b
Here L(ri) is the set of attributes on the left-hand side of rule ri and R(ri) is its decision attribute.
          a1  a2  b1  b2  c1  c2  d
x (r1)    n1  n2  v1  v2  u1  u2  k1
y (r2)            w1  w2          k2
In the table, L(r2) = {a1, a2, b1, b2, c1, c2}.
RSupS(r) = card{x: x supports r in S}
Confidence: ConfS(r) = RSupS(r)/SupS(r1)
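The rule-based support and the confidence ratio can be sketched in Python as follows. This is an illustration only: the slide does not define SupS(r1), so the sketch assumes it is the number of objects matching both sides of r1; the toy table and all function names are hypothetical.

```python
def matches(row, rule):
    """True if the object satisfies both the conditions and the decision of a rule."""
    conds, (d, k) = rule
    return row[d] == k and all(row[a] == v for a, v in conds.items())


def rule_support(table, r1, r2, stable):
    """Count objects x matching r1 for which some y matching r2 agrees on stable attributes."""
    targets = [row for row in table.values() if matches(row, r2)]
    return sum(
        matches(row, r1)
        and any(all(row[a] == y[a] for a in stable) for y in targets)
        for row in table.values()
    )


def confidence(table, r1, r2, stable):
    # Assumption: Sup(r1) is the number of objects matching r1.
    sup_r1 = sum(matches(row, r1) for row in table.values())
    return rule_support(table, r1, r2, stable) / sup_r1 if sup_r1 else 0.0


# Hypothetical toy table (not the table from the slides).
table = {
    "x1": {"a": 0, "b": "P", "c": 1, "d": "L"},
    "x2": {"a": 0, "b": "S", "c": 1, "d": "H"},
    "x3": {"a": 2, "b": "P", "c": 1, "d": "L"},
    "x4": {"a": 0, "b": "P", "c": 2, "d": "L"},
}
r1 = ({"b": "P"}, ("d", "L"))
r2 = ({"a": 0, "b": "S"}, ("d", "H"))

rsup = rule_support(table, r1, r2, stable={"a"})  # 2 (x1 and x4)
conf = confidence(table, r1, r2, stable={"a"})    # 2 of the 3 objects matching r1
```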
Cost of Action Rule [Ras & Tzacheva]
Assumption: S = (X, A, V) is an information system, Y ⊆ X. Attribute b ∈ A is flexible in S and b1, b2 ∈ Vb.
By ρS(Y, b1, b2) we mean a number from (0, +∞] which describes the average predicted cost of the approved action associated with a possible re-classification of qualifying objects in Y from class b1 to b2.
Object x ∈ Y qualifies for re-classification from b1 to b2 if b(x) = b1.
ρS(Y, b1, b2) = +∞ if no approved action exists for re-classifying qualifying objects in Y from class b1 to b2.
If Y is uniquely defined, we often write ρS(b1, b2) instead of ρS(Y, b1, b2).
Cost of Action Rule
Action rule r: [(b1, v1 → w1) ∧ (b2, v2 → w2) ∧ … ∧ (bp, vp → wp)](x) ⇒ (d, k1 → k2)(x)
The cost of r in S: costS(r) = Σ{ρS(vi, wi) : 1 ≤ i ≤ p}, where ρS(v, w) is the cost of changing value v to w.
Action rule r is feasible in S if costS(r) < ρS(k1, k2).
For any feasible action rule r, the cost of the conditional part of r is lower than the cost of its decision part.
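The cost and feasibility test can be written down directly. In this minimal Python sketch the cost function is a dictionary `rho` keyed by (attribute, from, to), a missing entry is treated as an infinite cost (no approved action), and all numbers are made up for illustration.

```python
import math


def rule_cost(changes, rho):
    """Sum of the costs of all atomic changes on the left-hand side of the rule."""
    return sum(rho.get((attr, v, w), math.inf) for attr, (v, w) in changes.items())


def is_feasible(changes, decision, rho):
    """A rule is feasible when its left-hand side is cheaper than its decision part."""
    return rule_cost(changes, rho) < rho.get(decision, math.inf)


# Hypothetical costs.
rho = {
    ("b", "P", "S"): 3.0,
    ("c", 1, 2): 4.0,
    ("d", "L", "H"): 10.0,
}
changes = {"b": ("P", "S"), "c": (1, 2)}
decision = ("d", "L", "H")

cost = rule_cost(changes, rho)                # 7.0
feasible = is_feasible(changes, decision, rho)  # True, since 7.0 < 10.0
```

A rule containing a change with no approved action gets an infinite cost and is therefore never feasible.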
Extension: Cost of Action Rule
RS[(d, k1 → k2)] denotes the set of feasible action rules in S having the term (d, k1 → k2) as their decision part.
Assumption: Among the action rules in RS[(d, k1 → k2)], the user identifies a rule r of minimal cost. That cost may still be too high for the user to approve implementing r.
The cost of r might be high because of the high cost of one of its sub-terms (bj, vj → wj). In such a case, we may look for an action rule in RS[(bj, vj → wj)] of minimal cost that re-classifies qualifying objects from vj to wj.
Rules short on the left side: it was observed that such rules were not interesting (active mining).
Cost of Action Rule
Example:
r = [(b1, v1 → w1) ∧ … ∧ (bj, vj → wj) ∧ … ∧ (bp, vp → wp)](x) ⇒ (d, k1 → k2)(x)
In RS[(bj, vj → wj)] we find
r1 = [(bj1, vj1 → wj1) ∧ (bj2, vj2 → wj2) ∧ … ∧ (bjq, vjq → wjq)](x) ⇒ (bj, vj → wj)(x)
Then we can compose r with r1 and thereby replace the term (bj, vj → wj) by the left-hand side of r1:
[(b1, v1 → w1) ∧ … ∧ [(bj1, vj1 → wj1) ∧ (bj2, vj2 → wj2) ∧ … ∧ (bjq, vjq → wjq)] ∧ … ∧ (bp, vp → wp)](x) ⇒ (d, k1 → k2)(x)
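The composition step amounts to a substitution of one term by a set of terms. A minimal Python sketch, assuming a rule is stored as a dictionary with `changes` and `decision` entries (a layout chosen for this example only); refusing conflicting changes to the same attribute is an extra safeguard of the sketch, not something stated on the slide.

```python
def compose(rule, sub_rule):
    """Replace the term that sub_rule achieves by the left-hand side of sub_rule."""
    attr, v, w = sub_rule["decision"]
    if rule["changes"].get(attr) != (v, w):
        raise ValueError("sub_rule does not produce a term of rule")
    # Keep every term of the original rule except the one being replaced.
    changes = {a: t for a, t in rule["changes"].items() if a != attr}
    for a, t in sub_rule["changes"].items():
        if a in changes and changes[a] != t:
            raise ValueError(f"conflicting changes for attribute {a!r}")
        changes[a] = t
    return {"changes": changes, "decision": rule["decision"]}


# Hypothetical rules: r changes b1 and b2; r1 achieves (b2, v2 -> w2) through e1 and e2.
r = {
    "changes": {"b1": ("v1", "w1"), "b2": ("v2", "w2")},
    "decision": ("d", "k1", "k2"),
}
r1 = {
    "changes": {"e1": ("p1", "q1"), "e2": ("p2", "q2")},
    "decision": ("b2", "v2", "w2"),
}
composed = compose(r, r1)
# composed["changes"] now holds b1, e1 and e2; the term for b2 is gone.
```

The composed rule is worth keeping only when the total cost of the terms of r1 is lower than the cost of the term it replaces.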
Search Graph [Tzacheva & Ras]
In order to construct action rules of the lowest cost, we build a Search Graph GS: a directed graph built dynamically by applying action rules discovered from S to its nodes.
The initial node n0 of GS contains information from the user, associated with the system S, about which objects he/she would like to reclassify (e.g., from the class described by value k1 of the attribute d to the class k2) and the current cost, ρS(k1, k2), of the reclassification k1 → k2.
Any other node n in GS shows an alternative way to achieve the same reclassification at a cost lower than the cost assigned to every node preceding n in GS.
Search Graph
Assume that N is the set of nodes in graph GS and n0 is its initial node.
For any node n ∈ N, by f(n) = (Yn, {[vn,j → wn,j, ρS(vn,j, wn,j)]} j ∈ In) we mean its domain (a set of objects in S), the set of actions needed to reclassify objects from Yn, and their costs, where Yn ⊆ X and ρS(v, w) is the cost of changing value v to w.
We say that action rule r, discovered from S, is applicable to node n if:
Yn ∩ RSupS(r) ≠ Ø (RSupS(r) is read here as the set of objects supporting r)
(∃k ∈ In)[r ∈ RS[vn,k → wn,k]]
Minimal Cost Reclassification Search Graph for S.
n0 = {[k1 → k2, ρS(k1, k2)]}
r = [(b1, v1 → w1) ∧ (b2, v2 → w2) ∧ … ∧ (bp, vp → wp)](x) ⇒ (d, k1 → k2)(x)
n1 = {[v1 → w1, ρS(v1, w1)], [v2 → w2, ρS(v2, w2)], …, [vp → wp, ρS(vp, wp)]}
[Figure 4. Lowest Cost Reclassification Search Graph in a standalone system S: nodes n0, n1, n2, n3, …, nn; edges labeled by action rules r1, r2, r3, r4, …, rj, …, rn taken from RS[(d, k1 → k2)] in information system S]
Search Graph Properties
Property 1. Let f(n0) = (Y, {[k1 → k2, ρS(k1, k2)]}) and f(n) = (Yn, {[vn,k → wn,k, ρS(vn,k, wn,k)]} k ∈ In), where ρS(v, w) is the cost of changing value v to w. The cost assigned to node n for reclassifying x ∈ Yn from k1 to k2 is equal to:
Costk1→k2(n, x) = Σ{ρS(vn,k, wn,k) : k ∈ In}
Property 2. If node n2 is a successor of node n1, then Confk1→k2(n2, x) ≤ Confk1→k2(n1, x)
Property 3. If node n2 is a successor of node n1, then Costk1→k2(n2, x) ≤ Costk1→k2(n1, x)
Search for Action Rules [Tzacheva & Ras]
We propose an A*-type algorithm for speeding up the construction of the shortest path from the root to the goal node in graph GS.
A* is one of the most widely used search algorithms in AI. It is an informed search algorithm that uses a heuristic function h(n) to estimate the remaining distance to the goal; it is optimal when h never overestimates that distance.
We assume that the user provides three threshold values:
λ1 - threshold for minimum confidence of action rules.
λ2 - threshold for maximum cost of action rules.
λ3 - threshold for minimum feasibility of action rules.
Heuristic Method - A*
We assume that: h(ni) = [cost(ni, Yi) - λ2]/λ3, where λ2 is the maximum-cost threshold and λ3 is the minimum-feasibility threshold supplied by the user.
The heuristic value h(ni) is associated with any node ni in G. It shows the maximal number of steps that might be needed to reach the goal.
Also, we assume that g(ni) is the number of edges from the root to the current node.
Then we associate an estimated path length to the goal with each node as follows: f(ni) = h(ni) + g(ni)
Proposed Algorithm - A*
1. Initialize Q with search node [([conf(n0), h(n0)], [n0])] as the only entry; initialize the domain of n0 (given by the user) as Y0.
2. If Q is empty, fail. Else, pick search node s from Q with the least value of f. If two search nodes in Q have the same least value of f and an ontology is available, pick the search node s with the highest value of Ont(s).
3. If state(s) is a goal and conf(s) ≥ λ1 (the minimum-confidence threshold), return s (we have reached the goal).
4. Otherwise remove s from Q.
5. Find all children of state(s) and create all the one-step extensions of s to each descendant. If state(s1) is a child of state(s) and r is the action rule applied to s in order to move from s to s1, then initialize Ystate(s1) as Ystate(s) ∩ DomS(r) and, if an ontology is available, Ont(s1) as Ont(r).
6. Add all the extended paths to Q.
7. Go to step 2.
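The steps above can be sketched as a small priority-queue search in Python. This is a sketch under stated assumptions, not the LowestCostReclassifier implementation: a node is a set of atomic changes, a node counts as a goal when its cost does not exceed the maximum-cost threshold, the confidence of a node is the product of the confidences of the rules applied so far, and the ontology tie-break is omitted. All names (`lowest_cost_search`, `rho`, `lam1`, `lam2`, `lam3`) and the data are hypothetical.

```python
import heapq
import itertools


def node_cost(terms, rho):
    """Cost of a node: sum of the costs of its atomic changes."""
    return sum(rho[t] for t in terms)


def lowest_cost_search(goal_term, objects, rules, rho, lam1, lam2, lam3):
    tie = itertools.count()  # unique tie-breaker so the heap never compares sets

    def h(terms):
        return (node_cost(terms, rho) - lam2) / lam3

    root = frozenset([goal_term])
    best = (node_cost(root, rho), root, 1.0)
    queue = [(h(root), next(tie), 0, 1.0, root, frozenset(objects))]
    while queue:
        _, _, g, conf, terms, dom = heapq.heappop(queue)   # least f = g + h first
        cost = node_cost(terms, rho)
        if cost < best[0]:
            best = (cost, terms, conf)
        if cost <= lam2 and conf >= lam1:                   # assumed goal test
            return cost, terms, conf, True
        for r in rules:
            if r["decision"] not in terms:
                continue
            child = (terms - {r["decision"]}) | r["changes"]
            child_conf = conf * r["conf"]
            child_dom = dom & r["dom"]                      # Y(s1) = Y(s) ∩ Dom(r)
            # Skip children with an empty domain, low confidence, or no cost gain.
            if not child_dom or child_conf < lam1 or node_cost(child, rho) >= cost:
                continue
            heapq.heappush(
                queue,
                (g + 1 + h(child), next(tie), g + 1, child_conf, child, child_dom),
            )
    return (*best, False)  # threshold not reached: report the best node found


T_d, T_b = ("d", "k1", "k2"), ("b", "v", "w")
T_c, T_e = ("c", "p", "q"), ("e", "s", "t")
rho = {T_d: 10, T_b: 6, T_c: 2, T_e: 1}
rules = [
    {"changes": frozenset([T_b, T_c]), "decision": T_d, "conf": 0.9, "dom": {"x1", "x2"}},
    {"changes": frozenset([T_e]), "decision": T_b, "conf": 0.8, "dom": {"x2", "x3"}},
]
result = lowest_cost_search(T_d, {"x1", "x2", "x3"}, rules, rho, lam1=0.7, lam2=4, lam3=1)
```

With these made-up numbers the search expands the root (cost 10) to a node of cost 8 and then to a node of cost 3 containing the changes to c and e, which meets both thresholds.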
Implementation and Testing
The heuristic strategy for lowest cost reclassification, the LowestCostReclassifier software, is implemented in C++ using the Microsoft Visual Studio 7.0 IDE and compiler.
The user is asked to enter the attribute he/she is interested in reclassifying, together with its current and desired values.
The user also chooses the following 3 thresholds:
λ1 - minimum confidence of action rules
λ2 - maximum cost of action rules
λ3 - minimum feasibility of action rules
and supplies the currently known cost of the reclassification.
The action rules have the following form:
(attribute, valueFrom -> valueTo | cost) => (attribute, valueFrom -> valueTo | cost) confidence
The LowestCostReclassifier software was tested on three different databases: two in the medical domain and one in the financial domain.
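For readers who want to post-process rules written in the textual form shown above, here is a hedged Python sketch of a parser. The exact output syntax of LowestCostReclassifier is not documented in these slides, so the regular expression, the function name `parse_action_rule`, and the sample line are assumptions; in particular, the sketch assumes word-like attribute names and values and plain decimal costs.

```python
import re

# One term: (attribute, valueFrom -> valueTo | cost); "- >" is tolerated as well.
TERM = re.compile(r"\(\s*(\w+)\s*,\s*(\w+)\s*-\s*>\s*(\w+)\s*\|\s*([\d.]+)\s*\)")


def parse_action_rule(line):
    """Split a rule line into left-hand terms, the decision term, and the confidence."""
    left, sep, right = line.partition("=>")
    if not sep:
        raise ValueError("missing '=>' separator")
    lhs = [(a, v, w, float(c)) for a, v, w, c in TERM.findall(left)]
    match = TERM.search(right)
    if not lhs or match is None:
        raise ValueError("could not find terms on both sides of the rule")
    attr, v, w, c = match.groups()
    confidence = float(right[match.end():].strip())
    return lhs, (attr, v, w, float(c)), confidence


# Hypothetical sample line in the form given on the slide.
sample = "(b, P -> S | 3.0) => (d, L -> H | 10.0) 0.85"
lhs, rhs, confidence = parse_action_rule(sample)
```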
Conclusions
We extract action rules as per the original algorithm presented in [62].
Next, we proposed a heuristic approach, using the A* algorithm, for building a search graph G that identifies an action rule of the lowest cost subject to three thresholds provided by the user: minimum confidence, maximum cost, and minimum feasibility.
Further, we observed that even if the maximum cost threshold is not reachable, we still return the best node found so far, whose cost is still lower than the cost currently known to the user.
In that sense, the leaves of our graph G and the nodes close to them represent the most actionable knowledge and thereby the most unexpected/interesting knowledge related to a desired reclassification of objects.
Final Claims
Subjective measure: user-driven, domain-dependent. Includes unexpectedness [Silberschatz and Tuzhilin, 1995], novelty, actionability [Piatetsky-Shapiro & Matheus, 1994].
Claim 1 [Suzuki, Padmanabhan & Tuzhilin]: Unexpectedness is partially an objective concept.
A → B is unexpected with respect to the belief α → β on the dataset D if the following conditions hold:
B ∧ β = False [B and β logically contradict each other]
A ∧ α holds on a large subset of D
A*α → B holds, which means A*α → ¬β
Our Claim: Actionability is partially an objective concept.
Actionability measure = Cost of an action rule
Final Claims
Our Claim: the cheapest rules are the most actionable.
Claim 2 [Silberschatz & Tuzhilin]: most actionable rules are unexpected.
Our Claim: the cheapest rules are unexpected.
References:
Z. Ras, A. Tzacheva, L.-S. Tsay, "Action Rules", in Encyclopedia of Data Warehousing and Mining (Ed. J. Wang), Idea Group Inc., 2005, to appear
A. Tzacheva, Z. Ras, "Action rules mining", in the Special Issue on Knowledge Discovery, International Journal of Intelligent Systems, Wiley, 2005, to appear
A. Tzacheva, Z. Ras, "Discovering non-standard semantics of semi-stable attributes", Proceedings of FLAIRS-2003, St. Augustine, Florida, AAAI Press, 2003
Questions? Thank You
