Rough Set Model Selection for Practical Decision Making Joseph P. Herbert JingTao Yao Department of Computer Science University of Regina jtyao@cs.uregina.ca
Introduction Rough sets have been applied to many areas in order to aid decision making. Information (rules) derived from multi-attribute data helps users in making decisions. Rough set reducts minimize the strain on the user by giving them only the necessary information. J T Yao
Motivation Can we further utilize the strengths provided by rough sets in order to make more informed decisions? Can we differentiate the types of decisions that can be made from using various rough set methods? Can we provide some sort of support mechanism to the user to help them choose a suitable rough set method for their analysis?
Rough Sets Developed in the early 1980s by Zdzislaw Pawlak. Sets derived from imperfect, imprecise, and incomplete data may not be precisely definable. Such sets must be approximated, using describable concepts to approximate a known concept. Example: 1.76 cm is approximated by 1.7 and 1.8.
Key Concepts Information systems/tables and decision tables. Indiscernibility. Set approximation. Reducts.
Information Table: An Example I = (U, A), where U = non-empty finite set of objects and A = non-empty finite set of attributes such that a : U → V_a for all a ∈ A, where V_a is the set of values for attribute a. Columns: Object, Date, High, Close. Rows: 1, 1-Jul-91, 1434.98, 1421.54; 2, 2-Jul-91, 1473.99; 3, 3-Jul-91, 1467.78.
Decision Table: An Example T = (U, A ∪ {d}), where U = non-empty finite set of objects, A = non-empty finite set of conditional attributes, and d = one or more decision attributes. Columns: Object, Date, High, Close, Decision. Rows: 1, 1-Jul-91, 1434.98, 1421.54; 2, 2-Jul-91, 1473.99; 3, 3-Jul-91, 1467.78, -1.
Indiscernibility For any B ⊆ A in I = (U, A), there exists an equivalence relation IND(B) = {(x, y) ∈ U × U | a(x) = a(y) for all a ∈ B}, where IND(B) is the B-indiscernibility relation. An equivalence relation partitions U into equivalence classes: [x]_B = {y ∈ U | (x, y) ∈ IND(B)}.
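Two objects are B-indiscernible when they agree on every attribute in B, so the partition can be built by grouping objects on their attribute values. The following is a minimal sketch under that reading; the function name `equivalence_classes` and the toy table are hypothetical and are not the stock data from the slides.

```python
from collections import defaultdict

def equivalence_classes(table, attrs):
    """Partition object ids by their values on the attributes in attrs."""
    groups = defaultdict(set)
    for obj, row in table.items():
        # Objects with the same value tuple are indiscernible w.r.t. attrs.
        key = tuple(row[a] for a in attrs)
        groups[key].add(obj)
    return list(groups.values())

# Hypothetical toy information table (assumption, for illustration only).
table = {
    1: {"trend": "up", "volume": "high"},
    2: {"trend": "up", "volume": "high"},
    3: {"trend": "down", "volume": "high"},
    4: {"trend": "down", "volume": "low"},
}

by_trend = equivalence_classes(table, ["trend"])
by_both = equivalence_classes(table, ["trend", "volume"])
```

Using more attributes can only split classes further, which is why `by_both` is a refinement of `by_trend`.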
Set Approximation Data may not precisely define distinct, crisp sets. A rough set has a lower and upper approximation.
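Visualize Rough Sets Let T = (U, A), B ⊆ A, X ⊆ U. Lower Approximation: {x ∈ U | [x]_B ⊆ X}. Upper Approximation: {x ∈ U | [x]_B ∩ X ≠ ∅}. Boundary Region: the upper approximation minus the lower approximation.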
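The lower approximation collects the equivalence classes that lie entirely inside the target set, the upper approximation collects the classes that overlap it, and the boundary is their difference. A small sketch of that computation follows; the function name `approximations` and the example sets are illustrative assumptions.

```python
def approximations(classes, target):
    """Return (lower, upper, boundary) of target for a given partition."""
    lower, upper = set(), set()
    for c in classes:
        if c <= target:      # class fully contained in the target set
            lower |= c
        if c & target:       # class shares at least one object with the target
            upper |= c
    return lower, upper, upper - lower

# Hypothetical partition of U = {1, ..., 5} and a target concept.
classes = [{1, 2}, {3}, {4, 5}]
target = {1, 2, 4}

lower, upper, boundary = approximations(classes, target)
```

Here the class {4, 5} only partly overlaps the target, so it ends up in the boundary region.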
Rough Set Methods for Data Analysis Two types of models are the focus: the algebraic method, and the probabilistic methods (the decision-theoretic method and the variable-precision method). Each method has different strengths that can be used to improve decision making.
Types of Decisions Broadly, there are two main types of decisions that can be made using rough set analysis. Immediate decisions (Unambiguous). Delayed decisions (Ambiguous). We can further categorize decision types by looking at rough set method strengths.
Immediate Decisions These types of decisions are based upon classification with the POS and NEG regions. The user can interpret findings as: Classification into POS regions can be considered a “yes” answer. Classification into NEG regions can be considered a “no” answer.
Delayed Decisions These types of decisions are based on classification in the BND region. A “wait-and-see” approach to decision making. A decision-maker can decrease ambiguity with the following: Obtain more information (more data). Decrease the tolerance for acceptable loss (decision-theoretic) or for user thresholds (variable-precision).
Algebraic Decisions Decisions made from algebraic rough set analysis. Immediate If P(A|[x]) = 1, then x is in POS(A). If P(A|[x]) = 0, then x is in NEG(A). Delayed If 0 < P(A|[x]) < 1, then x is in BND(A).
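The three algebraic rules can be sketched directly in code. This sketch assumes P(A|[x]) is estimated as the fraction of the equivalence class [x] that belongs to the concept A, which is the usual rough-membership reading but is not spelled out on the slide; the function names are hypothetical.

```python
from fractions import Fraction

def conditional_probability(eq_class, concept):
    """Estimate P(A | [x]) as |A ∩ [x]| / |[x]| (assumed estimator)."""
    return Fraction(len(eq_class & concept), len(eq_class))

def pawlak_region(p):
    """Map P(A | [x]) to the algebraic (Pawlak) region."""
    if p == 1:
        return "POS"   # immediate "yes"
    if p == 0:
        return "NEG"   # immediate "no"
    return "BND"       # delayed decision

# Hypothetical concept and equivalence classes.
concept = {1, 2, 4}
regions = [pawlak_region(conditional_probability(c, concept))
           for c in ({1, 2}, {3}, {4, 5})]
```

Exact fractions are used so that the comparisons with 0 and 1 are not affected by floating-point rounding.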
Variable-Precision Decisions Decisions made from variable-precision rough set analysis. User-defined thresholds l and u represent the lower and upper bounds used to define regions. Pure Immediate decisions. User-Accepted Immediate decisions. User-Rejected Immediate decisions. Delayed decisions.
Variable-Precision Decisions Pure Immediate If P(A|[x]) = 1, then x is in POS1 (A). If P(A|[x]) = 0, then x is in NEG0 (A). User-Accepted Immediate If u ≤ P(A|[x]) < 1, then x is in POSu (A). User-Rejected Immediate If 0 < P(A|[x]) ≤ l, then x is in NEGl (A). Delayed If l < P(A|[x]) < u, then x is in BNDl,u (A).
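The five variable-precision cases can be written as one classification function. This is a minimal sketch that assumes 0 ≤ l < u ≤ 1; the function name and the example thresholds are hypothetical.

```python
def variable_precision_region(p, l, u):
    """Return (region, decision type) for P(A | [x]) = p and thresholds l < u."""
    if not 0 <= l < u <= 1:
        raise ValueError("thresholds must satisfy 0 <= l < u <= 1")
    if p == 1:
        return "POS_1", "pure immediate"
    if p == 0:
        return "NEG_0", "pure immediate"
    if p >= u:
        return "POS_u", "user-accepted immediate"
    if p <= l:
        return "NEG_l", "user-rejected immediate"
    return "BND_l,u", "delayed"

# Hypothetical user thresholds l = 0.2 and u = 0.8.
results = [variable_precision_region(p, 0.2, 0.8) for p in (1, 0.9, 0.5, 0.1, 0)]
```

Moving l and u closer together shrinks the delayed (boundary) region, at the price of accepting more uncertain classifications.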
Decision-Theoretic Decisions Decisions made from decision-theoretic rough set analysis. Cost (risk) calculated with the Bayesian decision procedure yields the α, β values that minimize risk for region division. Pure Immediate decisions. Accepted Loss Immediate decisions. Rejected Loss Immediate decisions. Delayed decisions.
Decision-Theoretic Decisions Pure Immediate If P(A|[x]) = 1, then x is in POS1 (A). If P(A|[x]) = 0, then x is in NEG0 (A). Accepted Loss Immediate If α ≤ P(A|[x]) < 1, then x is in POSα (A). Rejected Loss Immediate If 0 < P(A|[x]) ≤ β, then x is in NEGβ (A). Delayed If β < P(A|[x]) < α, then x is in BNDα, β (A).
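The slides do not show how α and β are obtained. One common formulation compares the expected loss of classifying into POS, BND, or NEG and solves for the probabilities at which the cheapest action changes. The sketch below follows that formulation; the parameter names (for example `l_pn`, the loss of choosing POS when x is not in A) and the loss values are assumptions made for illustration.

```python
def dtrs_thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Compute (alpha, beta) from a loss function.

    l_xy is the loss of choosing region x (p=POS, b=BND, n=NEG)
    when the object's true state is y (p: in A, n: not in A).
    Assumes l_pp <= l_bp < l_np and l_nn <= l_bn < l_pn.
    """
    # POS is no worse than BND when P(A|[x]) >= alpha.
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    # NEG is no worse than BND when P(A|[x]) <= beta.
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta

# Hypothetical losses: correct decisions cost 0, deferring costs 1,
# a wrong immediate decision costs 4.
alpha, beta = dtrs_thresholds(l_pp=0, l_bp=1, l_np=4, l_pn=4, l_bn=1, l_nn=0)
```

With these example losses β < α, so a boundary region exists; other loss functions may need an extra check that the thresholds are ordered this way.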
A Simple Example: Parking a Car Set of states: the meeting will be over in less than 2 hours; the meeting will be over in more than 2 hours. Set of actions: park the car at a meter; park the car in a parking lot.
Costs of Parking Your Car Columns: meter, lot. Meeting ≤ 2 hours: $2.00 (meter), $7.00 (lot). Meeting > 2 hours: $12.00.
Making Decisions Based on Probabilities Assume the probability of each state is given. Compute the expected cost of each action and take the action with the lower expected cost: park the car at a meter.
Determine the Probability Threshold for One Action The condition for taking the action of parking the car at a meter is that its expected cost does not exceed the expected cost of parking in the lot.
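The threshold can be worked out by equating the expected costs of the two actions. The sketch below is illustrative only: it assumes the meter costs $2.00 for a short meeting and $12.00 for a long one, and that the lot costs $7.00 in both cases. The second lot cost does not appear in the surviving cost table, so it is an assumption, as are the function names.

```python
def expected_cost(cost_short, cost_long, p_short):
    """Expected cost of an action given P(meeting <= 2 hours) = p_short."""
    return cost_short * p_short + cost_long * (1.0 - p_short)

def meter_threshold(meter, lot):
    """Smallest P(meeting <= 2 hours) at which the meter is no more costly.

    meter and lot are (cost if short, cost if long) pairs. Assumes the meter
    is cheaper for a short meeting and more expensive for a long one.
    """
    extra_if_long = meter[1] - lot[1]    # penalty for choosing the meter
    saving_if_short = lot[0] - meter[0]  # benefit of choosing the meter
    return extra_if_long / (extra_if_long + saving_if_short)

meter = (2.0, 12.0)  # from the cost table (assignment of $12.00 assumed)
lot = (7.0, 7.0)     # second value is an assumption, not in the source

threshold = meter_threshold(meter, lot)
cost_meter = expected_cost(*meter, p_short=0.8)
cost_lot = expected_cost(*lot, p_short=0.8)
```

Under these assumed costs, parking at the meter is the cheaper choice whenever the probability of a short meeting is at least the computed threshold.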
Relationships Amongst Rough Set Models Diagram labels: decision-theoretic model, variable-precision model, probabilistic rough set approximations, loss function, threshold values, Bayesian decision theory.
Summary of Decisions (Region: Decision Type)
Pawlak Method: POS(A): Immediate; BND(A): Delayed; NEG(A): Immediate.
Variable-Precision Method: POS1 (A): Pure Immediate; POSu (A): User-accepted Immediate; BNDl,u (A): Delayed; NEGl (A): User-rejected Immediate; NEG0 (A): Pure Immediate.
Decision-Theoretic Method: POS1 (A): Pure Immediate; POSα (A): Accepted Loss Immediate; BNDα, β (A): Delayed; NEGβ (A): Rejected Loss Immediate; NEG0 (A): Pure Immediate.
Choosing a Method If the user is informed enough to provide thresholds, variable-precision rough sets can be used for data analysis. If cost or risk information is beneficial to the types of decisions being made, decision-theoretic rough sets can be used for data analysis.
Conclusions We can utilize the strengths of various rough set methods in order to improve our decision making capability. The various rough set methods can each make different types of decisions. By determining what kind of decisions they wish to make, users can choose a suitable rough set method for data analysis to reach their goals.
Where is Regina?