Mining Unexpected Rules by Pushing User Dynamics


1 Mining Unexpected Rules by Pushing User Dynamics
Ke Wang, Yuelong Jiang, Laks V.S. Lakshmanan

2 Unexpected Rules
Unexpectedness: the user finds the rules surprising.
Existing approaches:
  Syntax distance (B. Liu, W. Hsu, AAAI96)
  Logical contradiction (B. Padmanabhan, A. Tuzhilin, KDD98)
Both work by direct comparison between rules.
2019/1/1 KDD03

3 Our approach: Data Violation
Knowledge rules Ui: (not captured in this transcript)
The data rule r: (not captured in this transcript)
r is unexpected to a user who links “owning house at BeverlyHill” to “movie stars” and “well paid”.
Each tuple that satisfies r but violates Ui is evidence for the unexpectedness of r.

4 Three Issues
Knowledge Dynamics: the user decides the best knowledge to apply given a scenario (i.e., a tuple). (modeling)
Knowledge Push: push user knowledge right from the start of the search. (rule mining)
Unexpectedness Dynamics: adjust the unexpectedness of the remaining rules by what has been presented so far. (rule selection)

5 Rule Representation
Knowledge rules and data rules (slide figure, labeled “Target attribute”, not captured in this transcript).
Data rules use domain values; knowledge rules use fuzzy terms (such as “High”, “Low”).
The match degree measures the match between a domain value (e.g., Primary) and a fuzzy term (e.g., Low).

6 Main Ideas
Preference model: the user specifies the “best” knowledge rules for each tuple, e.g., U1 and U2 for those owning a house at BeverlyHill.
Violation model: we measure the unexpectedness of r by how much the tuples that satisfy r violate their best knowledge rules.

7 The Preference Model
The user specifies the covering knowledge for each tuple: the d (covering depth) “best” knowledge rules that match the tuple.
Ways to specify “best”:
  Explicit enumeration (not scalable)
  Rank by preference: “max strength”, “best match”, “min violation”, etc.
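The "rank by preference" option can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `KnowledgeRule` class, its `strength` field, and the rule that a body match degree above 0 counts as "matching" are all assumptions made for the example, and only the “max strength” ranking is shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]  # a tuple t, as attribute -> value (hypothetical encoding)


@dataclass
class KnowledgeRule:
    name: str
    strength: float                     # hypothetical user-assigned strength
    body_match: Callable[[Row], float]  # bm(t, U), expected to lie in [0, 1]


def covering_knowledge(t: Row, rules: List[KnowledgeRule], d: int) -> List[KnowledgeRule]:
    """Return the d 'best' knowledge rules that match t, ranked by max strength."""
    # Assumption: a rule matches t when its body match degree is positive.
    matching = [u for u in rules if u.body_match(t) > 0.0]
    matching.sort(key=lambda u: u.strength, reverse=True)
    return matching[:d]


def owns_house_at(t: Row) -> float:
    # Toy crisp matcher; a real matcher would return a fuzzy degree.
    return 1.0 if t.get("house") == "BeverlyHill" else 0.0


rules = [
    KnowledgeRule("U1", 0.9, owns_house_at),
    KnowledgeRule("U2", 0.6, owns_house_at),
    KnowledgeRule("U3", 0.8, lambda t: 0.0),  # never matches
]
t = {"house": "BeverlyHill", "job": "teacher"}
best = [u.name for u in covering_knowledge(t, rules, d=2)]
```

Swapping the sort key would give the other rankings named on the slide (“best match”, “min violation”).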

8 The Violation Model
For a tuple t and a knowledge rule U:
  Body match degree, bm(t,U), in [0,1]
  Head match degree, hm(t,U), in [0,1]
  Violation of U by t, v(t,U): defined by cases, one when bm(t,U) meets a condition and another otherwise (formula not captured in this transcript)
Violation of t, v(t), is v(t,U) aggregated over the covering knowledge U of t.
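Since the exact formula is missing from the transcript, the sketch below uses a stand-in to show the shape of the computation. Everything specific here is an assumption: the form bm × (1 − hm), the `min_body_match` threshold, and the use of `max` as the aggregate are illustrative choices, not the definitions from the talk.

```python
from typing import Callable, Iterable, List, Tuple


def violation(bm: float, hm: float, min_body_match: float = 0.5) -> float:
    """Assumed v(t, U): high when the body matches well but the head does not."""
    if not (0.0 <= bm <= 1.0 and 0.0 <= hm <= 1.0):
        raise ValueError("match degrees must lie in [0, 1]")
    if bm >= min_body_match:      # assumed condition on bm(t, U)
        return bm * (1.0 - hm)    # assumed form, not taken from the slide
    return 0.0                    # the 'otherwise' case


def tuple_violation(
    degrees: Iterable[Tuple[float, float]],
    aggregate: Callable[[List[float]], float] = max,
) -> float:
    """Assumed v(t): aggregate v(t, U) over the (bm, hm) pairs of t's covering knowledge."""
    values = [violation(bm, hm) for bm, hm in degrees]
    return aggregate(values) if values else 0.0


# t strongly matches the body of one rule but only weakly matches its head.
vt = tuple_violation([(1.0, 0.25), (0.8, 1.0)])
```

Passing `aggregate=sum` instead would count violations of several rules cumulatively.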

9 The Mining Problem
The slide defines three measures (formulas not captured in this transcript):
  Unexpectedness Support of r
  Unexpectedness Confidence of r
  Unexpectedness of r
Problem: find all data rules r above the specified thresholds for Usup and Ustr.

10 The Mining Algorithm
Three Phases:
  Violation Phase
  Rule Phase
  Final Phase

11 Violation Phase
Compute and store v(t) for all tuples t in the database T, pruning all t with v(t) = 0, to get a new database T’.
This prunes the data that is consistent with the user knowledge, which is very effective.
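A minimal sketch of this phase, assuming the violation function v is supplied by the caller and tuples are encoded as sets of items (both are illustrative choices, not details from the talk):

```python
from typing import Callable, FrozenSet, List, Tuple

Items = FrozenSet[str]  # hypothetical encoding of a tuple as a set of items


def violation_phase(
    database: List[Items], v: Callable[[Items], float]
) -> List[Tuple[Items, float]]:
    """Compute v(t) once per tuple and keep only tuples with v(t) > 0."""
    pruned = []
    for t in database:
        vt = v(t)
        if vt > 0:                 # tuples consistent with user knowledge are dropped
            pruned.append((t, vt)) # store v(t) so later phases need not recompute it
    return pruned


def toy_v(t: Items) -> float:
    # Toy violation: the user expects house owners at BeverlyHill to be well paid.
    return 1.0 if "house=BeverlyHill" in t and "pay=low" in t else 0.0


T = [
    frozenset({"house=BeverlyHill", "pay=low"}),
    frozenset({"house=BeverlyHill", "pay=high"}),
    frozenset({"house=Other", "pay=low"}),
]
T_prime = violation_phase(T, toy_v)
```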

12 Rule Phase
Generate all rules r with Usup(r) above the threshold using T’.
Usup(r) is anti-monotone:
  Usup(r) does not increase as the body b(r) grows
  this holds independent of the preference model and the violation function v(t)
Any frequent itemset algorithm can be applied in this phase.
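To illustrate why a frequent itemset algorithm applies, here is a small Apriori-style sketch over the pruned database. It assumes Usup is the (unnormalized) sum of v(t) over the tuples containing an itemset, which is one definition consistent with the anti-monotone property but is not confirmed by the transcript; `usup` and `mine_usup` are hypothetical names.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

WeightedDB = List[Tuple[FrozenSet[str], float]]  # (tuple items, v(t)) pairs, i.e. T'


def usup(itemset: FrozenSet[str], db: WeightedDB) -> float:
    """Assumed Usup: total violation of the tuples that contain the itemset."""
    return sum(vt for items, vt in db if itemset <= items)


def mine_usup(db: WeightedDB, min_usup: float) -> Dict[FrozenSet[str], float]:
    """Level-wise search that prunes with the anti-monotone property of Usup."""
    level = [frozenset([i]) for i in sorted({i for items, _ in db for i in items})]
    result: Dict[FrozenSet[str], float] = {}
    while level:
        k = len(level[0]) + 1
        kept = []
        for cand in level:
            s = usup(cand, db)
            if s >= min_usup:
                result[cand] = s
                kept.append(cand)
        kept_set = set(kept)
        # Join kept itemsets into size-k candidates, then drop any candidate
        # with a (k-1)-subset that failed the threshold.
        joined = {a | b for a, b in combinations(kept, 2) if len(a | b) == k}
        level = [
            c for c in joined
            if all(frozenset(s) in kept_set for s in combinations(c, k - 1))
        ]
    return result


db = [
    (frozenset("abc"), 0.5),
    (frozenset("ab"), 0.25),
    (frozenset("ac"), 0.25),
]
found = mine_usup(db, min_usup=0.75)
```

Because every v(t) is non-negative, adding an item to an itemset can only shrink the set of contributing tuples, so the usual subset-based pruning stays valid whatever v(t) is.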

13 Final Phase
Compute sup(r) and sup(b(r)) for the rules produced in the rule phase.
Output the rules r with Ustr(r) above the threshold.

14 The Selection Problem
Display a specified number k of rules to the user, in the order of unexpectedness.
See-and-Know Assumption: after seeing rules R, the user is interested only in rules that are unexpected with respect to (remainder not captured in this transcript)

15 The Selection Algorithm
At each step (until k rules are selected or there is no rule left to select):
  greedily select the most unexpected rule
  add the selected rule to the user knowledge
  for each matching tuple, update the violation values to reflect the new covering knowledge
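The greedy loop can be sketched as below, under strong simplifying assumptions that are not from the talk: a rule's unexpectedness is taken to be the sum of v(t) over the tuples it matches, and "adding the rule to user knowledge" is modeled by setting v(t) to 0 for those tuples, since the user now expects them. The real update recomputes each tuple's covering knowledge; `select_rules` and its arguments are hypothetical.

```python
from typing import Dict, List, Set


def select_rules(
    rules: Dict[str, Set[int]],    # rule name -> ids of the tuples it matches
    violations: Dict[int, float],  # tuple id -> current v(t)
    k: int,
) -> List[str]:
    """Greedily pick up to k rules, re-scoring after each pick."""
    v = dict(violations)           # work on a copy; the caller's values stay intact
    remaining = dict(rules)
    selected: List[str] = []
    while remaining and len(selected) < k:
        scores = {name: sum(v[t] for t in ids) for name, ids in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] <= 0:      # assumed stop: nothing unexpected is left
            break
        selected.append(best)
        for t in remaining.pop(best):
            v[t] = 0.0             # assumed update: the user now knows this rule
    return selected


rules = {"r1": {1, 2, 3}, "r2": {2, 3}, "r3": {4}}
violations = {1: 0.5, 2: 0.5, 3: 0.5, 4: 0.25}
order = select_rules(rules, violations, k=3)
```

In this toy run a static ranking would list r1, r2, r3, but once r1 has been shown, r2 carries no further surprise, so the dynamic selection returns r1 and then r3.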

16 Experiment Dataset
KDD-CUP-98 Dataset
Target attribute: NK97, the donation amount in the 1997 campaign, on five scales c0, c1, c2, c3, c4, in increasing order.
23 non-target attributes, whose meanings are easier to understand than those of the other attributes.

17 User Knowledge
Observation: people tend to keep their donation behavior unchanged.
Four knowledge rules: (not captured in this transcript)

18 Efficiency of Mining
Three Algorithms:
  UMINE(NULL): without user knowledge
  UMINE-Unpruned: without tuple pruning
  UMINE-Pruned: pruning those tuples with v(t) = 0

19 Interestingness of Rules
Ui(x,y): Ui covers x tuples with total violation y
Violate two rules

20 Effectiveness of Selection

21 Conclusion
A new approach for finding interesting rules by modeling user knowledge: violation of covering knowledge by satisfying tuples.
Model the human user as a dynamic entity in applying knowledge and interpreting presented rules.
Push user knowledge into data preparation, mining, and rule selection. This benefits both search and quality.

