_ Rough Sets. Basic Concepts of Rough Sets _ Information/Decision Systems (Tables) _ Indiscernibility _ Set Approximation _ Reducts and Core.


_ Rough Sets

Basic Concepts of Rough Sets _ Information/Decision Systems (Tables) _ Indiscernibility _ Set Approximation _ Reducts and Core

Information Systems/Tables _ An information system IS is a pair (U, A). _ U is a non-empty finite set of objects (the universe). _ A is a non-empty finite set of attributes such that for every a ∈ A, a : U → V_a, where V_a is called the value set of a.

     Age    LEMS
x1   16-30  50
x2   16-30  0
x3   31-45  1-25
x4   31-45  1-25
x5   46-60  26-49
x6   16-30  26-49
x7   46-60  26-49

Decision Systems/Tables _ DS = (U, A ∪ {d}), where d ∉ A is the decision attribute (instead of one we can consider more decision attributes). _ The elements of A are called the condition attributes.

     Age    LEMS   Walk
x1   16-30  50     yes
x2   16-30  0      no
x3   31-45  1-25   no
x4   31-45  1-25   yes
x5   46-60  26-49  no
x6   16-30  26-49  yes
x7   46-60  26-49  no

Indiscernibility _ The equivalence relation: a binary relation R ⊆ U × U which is reflexive (xRx for any object x), symmetric (if xRy then yRx), and transitive (if xRy and yRz then xRz). _ The equivalence class [x]_R of an element x ∈ U consists of all objects y ∈ U such that xRy.
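The three properties above can be checked mechanically on a finite universe. A minimal sketch in Python; the relation R and universe U below are illustrative assumptions, not taken from the slides:

```python
def is_equivalence(U, R):
    """Check reflexivity, symmetry and transitivity of R (a set of pairs) on U."""
    reflexive = all((x, x) in R for x in U)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, w) in R
                     for (x, y) in R for (z, w) in R if y == z)
    return reflexive and symmetric and transitive

# Illustrative relation with equivalence classes {1, 2} and {3}.
U = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}
print(is_equivalence(U, R))   # True
```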

Indiscernibility (2) _ Let IS = (U, A) be an information system; then with any B ⊆ A there is an associated equivalence relation IND_IS(B) = {(x, x') ∈ U × U : for every a ∈ B, a(x) = a(x')}, where IND_IS(B) is called the B-indiscernibility relation. _ If (x, x') ∈ IND_IS(B), then objects x and x' are indiscernible from each other by attributes from B. _ The equivalence classes of the B-indiscernibility relation are denoted [x]_B.

An Example of Indiscernibility _ The non-empty subsets of the condition attributes are {Age}, {LEMS}, and {Age, LEMS}. _ IND({Age}) = {{x1,x2,x6}, {x3,x4}, {x5,x7}} _ IND({LEMS}) = {{x1}, {x2}, {x3,x4}, {x5,x6,x7}} _ IND({Age,LEMS}) = {{x1}, {x2}, {x3,x4}, {x5,x7}, {x6}}.

     Age    LEMS   Walk
x1   16-30  50     yes
x2   16-30  0      no
x3   31-45  1-25   no
x4   31-45  1-25   yes
x5   46-60  26-49  no
x6   16-30  26-49  yes
x7   46-60  26-49  no
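The partitions above can be computed by grouping objects on their attribute vectors. A sketch in Python; the attribute values in `table` are assumptions reconstructed to be consistent with the partitions listed on this slide:

```python
# Attribute values assumed so that they induce the partitions on this slide.
table = {
    "x1": {"Age": "16-30", "LEMS": "50"},
    "x2": {"Age": "16-30", "LEMS": "0"},
    "x3": {"Age": "31-45", "LEMS": "1-25"},
    "x4": {"Age": "31-45", "LEMS": "1-25"},
    "x5": {"Age": "46-60", "LEMS": "26-49"},
    "x6": {"Age": "16-30", "LEMS": "26-49"},
    "x7": {"Age": "46-60", "LEMS": "26-49"},
}

def ind_classes(table, B):
    """Equivalence classes of IND(B): group objects agreeing on every a in B."""
    groups = {}
    for x, row in table.items():
        groups.setdefault(tuple(row[a] for a in B), []).append(x)
    return sorted(sorted(g) for g in groups.values())

print(ind_classes(table, ["Age"]))   # [['x1', 'x2', 'x6'], ['x3', 'x4'], ['x5', 'x7']]
```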

Set Approximation _ Let T = (U, A), B ⊆ A and X ⊆ U. We can approximate X using only the information contained in B by constructing the B-lower and B-upper approximations of X, denoted B_*(X) and B^*(X) respectively, where B_*(X) = {x : [x]_B ⊆ X} and B^*(X) = {x : [x]_B ∩ X ≠ ∅}.

Set Approximation (2) _ The B-boundary region of X, BN_B(X) = B^*(X) − B_*(X), consists of those objects that we cannot decisively classify into X on the basis of B. _ A set is said to be rough if its boundary region is non-empty; otherwise the set is crisp.

An Example of Set Approximation _ Let W = {x | Walk(x) = yes} = {x1, x4, x6}. _ The decision class Walk is rough since the boundary region is not empty.

     Age    LEMS   Walk
x1   16-30  50     yes
x2   16-30  0      no
x3   31-45  1-25   no
x4   31-45  1-25   yes
x5   46-60  26-49  no
x6   16-30  26-49  yes
x7   46-60  26-49  no

An Example of Set Approximation (2) _ yes (certainly walk): A_*(W) = {x1} ∪ {x6} _ yes/no (boundary): BN_A(W) = {x3, x4} _ no (certainly not): U − A^*(W) = {x2} ∪ {x5, x7}
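The approximations can be assembled directly from the equivalence classes of IND({Age, LEMS}). A sketch in Python, assuming W = {x1, x4, x6} (the objects with Walk = yes):

```python
# Equivalence classes of IND({Age, LEMS}) from the earlier slide.
classes = [{"x1"}, {"x2"}, {"x3", "x4"}, {"x5", "x7"}, {"x6"}]
W = {"x1", "x4", "x6"}   # assumed decision class Walk = yes

lower = set().union(*(c for c in classes if c <= W))    # classes inside W
upper = set().union(*(c for c in classes if c & W))     # classes meeting W
boundary = upper - lower

assert lower == {"x1", "x6"}
assert upper == {"x1", "x3", "x4", "x6"}
assert boundary == {"x3", "x4"}   # non-empty, so W is rough
```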

Lower & Upper Approximations [Diagram: the universe U partitioned into equivalence classes U/R by a subset of attributes R, with a set X drawn across the partition.]

Lower & Upper Approximations (2) Lower Approximation: R_*(X) = {x ∈ U : [x]_R ⊆ X} Upper Approximation: R^*(X) = {x ∈ U : [x]_R ∩ X ≠ ∅}

Lower & Upper Approximations (3) The indiscernibility classes defined by R = {Headache, Temp.} are {u1}, {u2}, {u3}, {u4}, {u5, u7}, {u6, u8}. X1 = {u | Flu(u) = yes} = {u2, u3, u6, u7} R_*(X1) = {u2, u3} R^*(X1) = {u2, u3, u6, u7, u8, u5} X2 = {u | Flu(u) = no} = {u1, u4, u5, u8} R_*(X2) = {u1, u4} R^*(X2) = {u1, u4, u5, u8, u7, u6}

Lower & Upper Approximations (4) R = {Headache, Temp.} U/R = {{u1}, {u2}, {u3}, {u4}, {u5, u7}, {u6, u8}} X1 = {u | Flu(u) = yes} = {u2, u3, u6, u7} X2 = {u | Flu(u) = no} = {u1, u4, u5, u8} R_*(X1) = {u2, u3} R^*(X1) = {u2, u3, u6, u7, u8, u5} R_*(X2) = {u1, u4} R^*(X2) = {u1, u4, u5, u8, u7, u6} [Diagram: X1 and X2 drawn over the equivalence classes.]

Accuracy of Approximation α_B(X) = |B_*(X)| / |B^*(X)|, where |X| denotes the cardinality of X ≠ ∅. Obviously 0 ≤ α_B(X) ≤ 1. If α_B(X) = 1, X is crisp with respect to B. If α_B(X) < 1, X is rough with respect to B.
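Continuing the walking example, the accuracy works out to 2/4. A quick check in Python:

```python
# A-lower and A-upper approximations of W from the earlier example.
lower = {"x1", "x6"}
upper = {"x1", "x3", "x4", "x6"}

alpha = len(lower) / len(upper)
print(alpha)   # 0.5: W is rough (alpha < 1) with respect to {Age, LEMS}
```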

Issues in the Decision Table _ The same or indiscernible objects may be represented several times. _ Some of the attributes may be superfluous (redundant). That is, their removal cannot worsen the classification.

Reducts _ Keep only those attributes that preserve the indiscernibility relation and, consequently, set approximation. _ There are usually several such subsets of attributes and those which are minimal are called reducts.

Dispensable & Indispensable Attributes Let c ∈ C. Attribute c is dispensable in T if POS_C(D) = POS_(C−{c})(D); otherwise attribute c is indispensable in T. The C-positive region of D: POS_C(D) = ∪ {C_*(X) : X ∈ U/D}.
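The positive region can be computed as the union of the C-lower approximations of the decision classes. A minimal Python sketch on a hypothetical three-object table (not from the slides; u1 and u3 are deliberately indiscernible on {a, b} yet carry different decisions, so they fall outside the positive region):

```python
# Hypothetical table: u1 and u3 agree on a and b but differ on decision d.
U = {
    "u1": {"a": 0, "b": 1, "d": "y"},
    "u2": {"a": 1, "b": 1, "d": "n"},
    "u3": {"a": 0, "b": 1, "d": "n"},
}

def classes(attrs):
    """Partition of the universe by indiscernibility on attrs."""
    part = {}
    for x, row in U.items():
        part.setdefault(tuple(row[c] for c in attrs), set()).add(x)
    return list(part.values())

def pos(C, D):
    """POS_C(D): objects whose C-class lies entirely inside one D-class."""
    d_classes = classes(D)
    return {x for c in classes(C) for x in c
            if any(c <= dc for dc in d_classes)}

print(pos(["a", "b"], ["d"]))   # {'u2'}: u1 and u3 are indiscernible yet differ on d
```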

Independent _ T = (U, C, D) is independent if all c ∈ C are indispensable in T.

Reduct & Core _ The set of attributes R ⊆ C is called a reduct of C if T' = (U, R, D) is independent and POS_R(D) = POS_C(D). _ The set of all the condition attributes indispensable in T is denoted by CORE(C). CORE(C) = ∩ RED(C), where RED(C) is the set of all reducts of C.

An Example of Reducts & Core Reduct1 = {Muscle-pain, Temp.} Reduct2 = {Headache, Temp.} CORE = {Headache, Temp.} ∩ {Muscle-pain, Temp.} = {Temp.}

Discernibility Matrix (relative to positive region) _ Let T = (U, C, D) be a decision table, with U = {u1, u2, …, un}. By a discernibility matrix of T, denoted M(T), we mean the n × n matrix (c_ij) defined as c_ij = {c ∈ C : c(u_i) ≠ c(u_j)} for i, j = 1, 2, …, n such that d(u_i) ≠ d(u_j) and u_i or u_j belongs to the C-positive region of D (c_ij = ∅ otherwise). _ c_ij is the set of all the condition attributes that classify objects u_i and u_j into different classes.

Discernibility Matrix (relative to positive region) (2) _ The discernibility function is f(T) = ∧ {∨ c_ij : c_ij ≠ ∅}, where the conjunction is taken over all non-empty entries of M(T) corresponding to the indices i, j such that u_i or u_j belongs to the C-positive region of D. _ c_ij = ∅ denotes that this case does not need to be considered; hence it is interpreted as logical truth. _ All disjuncts of the minimal disjunctive form of this function define the reducts of T (relative to the positive region).

Discernibility Function (relative to objects) _ For any u_i ∈ U, f(u_i) = ∧_j (∨ c_ij), where (1) ∨ c_ij is the disjunction of all variables a such that a ∈ c_ij, if c_ij ≠ ∅; (2) it is the constant true, if c_ij = ∅ and d(u_i) = d(u_j); (3) it is the constant false, if c_ij = ∅ and d(u_i) ≠ d(u_j). Each logical product in the minimal disjunctive normal form (DNF) defines a reduct of instance u_i.

Examples of Discernibility Matrix

No  a   b   c   d
u1  a0  b1  c1  y
u2  a1  b1  c0  n
u3  a0  b2  c1  n
u4  a1  b1  c1  y

C = {a, b, c}, D = {d}. In order to discern equivalence classes of the decision attribute d, we have to preserve the conditions described by the discernibility matrix for this table:

     u1    u2   u3
u2   a,c
u3   b
u4         c    a,b

f = (a ∨ c) ∧ b ∧ c ∧ (a ∨ b) = b ∧ c, so Reduct = {b, c}
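The matrix and the reduct {b, c} can be reproduced by brute force: collect, for every pair of objects with different decisions, the attributes on which they differ, then search attribute subsets smallest-first for one that intersects every entry. A Python sketch of this (note it finds minimum-size reducts only):

```python
from itertools import combinations

# Decision table from this slide: condition attributes a, b, c; decision d.
rows = {
    "u1": ({"a": 0, "b": 1, "c": 1}, "y"),
    "u2": ({"a": 1, "b": 1, "c": 0}, "n"),
    "u3": ({"a": 0, "b": 2, "c": 1}, "n"),
    "u4": ({"a": 1, "b": 1, "c": 1}, "y"),
}

# Non-empty discernibility-matrix entries: attributes separating objects
# that carry different decisions.
entries = []
for (ui, (vi, di)), (uj, (vj, dj)) in combinations(rows.items(), 2):
    if di != dj:
        entries.append({a for a in vi if vi[a] != vj[a]})

# A reduct must intersect every entry (it satisfies the discernibility
# function); searching subsets smallest-first yields minimum-size reducts.
attrs = ["a", "b", "c"]
reducts = []
for k in range(1, len(attrs) + 1):
    for S in combinations(attrs, k):
        if all(set(S) & e for e in entries):
            reducts.append(set(S))
    if reducts:
        break

print([sorted(r) for r in reducts])   # [['b', 'c']]
```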

Examples of Discernibility Matrix (2) The non-empty entries of the discernibility matrix for objects u1–u7 are: b,c,d; b,c; b; b,d; c,d; a,b,c,d; a,b,c; a,b,c,d; a,b,c,d; a,b; c,d; c,d. Hence f = b ∧ (c ∨ d) = (b ∧ c) ∨ (b ∧ d). Core = {b} Reduct1 = {b,c} Reduct2 = {b,d}

Discretization _ In the discretization of a decision table T = (U, A ∪ {d}), where V_a = [l_a, r_a) is an interval of real values for a ∈ A, we search for a partition P_a of V_a for any a ∈ A. _ Any partition of V_a is defined by a sequence of so-called cuts l_a < c_1 < c_2 < … < c_k < r_a from V_a. _ Any family of partitions {P_a : a ∈ A} can be identified with a set of cuts.

Discretization (2) In the discretization process, we search for a set of cuts satisfying some natural conditions. [Tables: the original decision table T (condition attributes a, b; decision d; objects x1–x7) and the discretized table T^P.] P = {(a, 0.9), (a, 1.5), (b, 0.75), (b, 1.5)}

A Geometrical Representation of Data [Figure: objects x1–x7 plotted in the (a, b) plane.]

A Geometrical Representation of Data and Cuts [Figure: objects x1–x7 in the (a, b) plane, with the cuts drawn as vertical lines on a and horizontal lines on b.]

Discretization (3) _ The sets of possible values of a and b are the intervals V_a and V_b. _ The sets of values of a and b on objects from U are given by a(U) = {0.8, 1, 1.3, 1.4, 1.6}; b(U) = {0.5, 1, 2, 3}.
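Candidate cuts are conventionally placed between consecutive distinct attribute values, e.g. at midpoints (any cut inside the same interval discerns the same pairs of objects). A small Python sketch; the rounding is only to keep the printed values tidy:

```python
def candidate_cuts(values):
    """Midpoints between consecutive distinct values of a numeric attribute."""
    vs = sorted(set(values))
    return [round((lo + hi) / 2, 3) for lo, hi in zip(vs, vs[1:])]

print(candidate_cuts([0.8, 1, 1.3, 1.4, 1.6]))   # [0.9, 1.15, 1.35, 1.5]
```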

Discretization Based on RSBR (4) _ The discretization process returns a partition of the value sets of condition attributes into intervals.

A Discretization Process _ Step 1: define a set of Boolean variables, say {p_1^a, p_2^a, p_3^a, p_4^a, p_1^b, p_2^b, p_3^b}, where p_1^a corresponds to the interval [0.8, 1) of a, p_2^a corresponds to the interval [1, 1.3) of a, p_3^a corresponds to the interval [1.3, 1.4) of a, p_4^a corresponds to the interval [1.4, 1.6) of a, p_1^b corresponds to the interval [0.5, 1) of b, p_2^b corresponds to the interval [1, 2) of b, p_3^b corresponds to the interval [2, 3) of b.

The Set of Cuts on Attribute a [Figure: the axis of attribute a with the candidate cuts marked between the consecutive values 0.8, 1, 1.3, 1.4, 1.6.]

A Discretization Process (2) _ Step 2: create a new decision table by using the set of Boolean variables defined in Step 1. Let T = (U, A ∪ {d}) be a decision table and let p_k^a be the propositional variable corresponding to the interval [v_k^a, v_{k+1}^a) for any k and a ∈ A.

A Sample Defined in Step 2 U* consists of the pairs of objects with different decisions: (x1,x2), (x1,x3), (x1,x5), (x4,x2), (x4,x3), (x4,x5), (x6,x2), (x6,x3), (x6,x5), (x7,x2), (x7,x3), (x7,x5). [Table: for each pair, the Boolean variables marking the cuts that discern the two objects.]

The Discernibility Formula _ The discernibility formula p_1^a ∨ p_1^b ∨ p_2^b means that in order to discern objects x1 and x2, at least one of the following cuts must be set: a cut between a(0.8) and a(1), a cut between b(0.5) and b(1), a cut between b(1) and b(2).

The Discernibility Formulae for All Different Pairs

The Discernibility Formulae for All Different Pairs (2)

A Discretization Process (3) _ Step 3: find the minimal subset of P that discerns all objects in different decision classes. The discernibility Boolean propositional formula is defined as the conjunction, over all pairs in U*, of the corresponding discernibility formulas.

The Discernibility Formula in CNF Form

The Discernibility Formula in DNF Form _ We obtain four prime implicants; p_2^a ∧ p_4^a ∧ p_2^b is the optimal result, because it is the minimal subset of P.

The Minimal Set of Cuts for the Sample DB [Figure: objects x1–x7 in the (a, b) plane with the minimal cuts a = 1.2, a = 1.5 and b = 1.5 drawn in.]

A Result [Tables: the original table T and the discretized table T^P for objects x1–x7.] P = {(a, 1.2), (a, 1.5), (b, 1.5)}

Searching for Reducts Condition Attributes: a: Va = {1, 2} b: Vb = {0, 1, 2} c: Vc = {0, 1, 2} d: Vd = {0, 1} Decision Attribute: e: Ve = {0, 1, 2}

Searching for CORE Removing attribute a does not cause inconsistency. Hence a does not belong to the CORE.

Searching for CORE (2) Removing attribute b causes inconsistency. Hence b belongs to the CORE.

Searching for CORE (3) Removing attribute c does not cause inconsistency. Hence c does not belong to the CORE.

Searching for CORE (4) Removing attribute d does not cause inconsistency. Hence d does not belong to the CORE.

Searching for CORE (5) CORE(C) = {b}. Initial subset R = {b}. Attribute b is the unique indispensable attribute.

R = {b} The instances containing b0 will not be considered. [Tables: T and the reduced table T'.]

Attribute Evaluation Criteria _ Select the attributes that cause the number of consistent instances to increase fastest –To obtain a subset of attributes that is as small as possible _ Select the attribute that has the smaller number of different values –To guarantee that the number of instances covered by a rule is as large as possible.

Selecting Attribute from {a,c,d} 1. Selecting {a}: R = {a, b} U/{e} = {{u3, u5, u6}, {u4}, {u7}} U/{a, b} = {{u3}, {u4}, {u7}, {u5}, {u6}}

Selecting Attribute from {a,c,d} (2) 2. Selecting {c}: R = {b, c} U/{e} = {{u3, u5, u6}, {u4}, {u7}}

Selecting Attribute from {a,c,d} (3) 3. Selecting {d}: R = {b, d} U/{e} = {{u3, u5, u6}, {u4}, {u7}}

Selecting Attribute from {a,c,d} (4) 3. Selecting {d}: R = {b, d} U/{e} = {{u3, u5, u6}, {u4}, {u7}} U/{b, d} = {{u3, u4}, {u7}, {u5, u6}} Result: subset of attributes = {b, d}