On Applications of Rough Sets theory to Knowledge Discovery Frida Coaquira UNIVERSITY OF PUERTO RICO MAYAGÜEZ CAMPUS

Introduction One goal of Knowledge Discovery is to extract meaningful knowledge. Rough set theory was introduced by Z. Pawlak (1982) as a mathematical tool for data analysis. Rough sets have many applications in the field of Knowledge Discovery: feature selection, the discretization process, data imputation, and the creation of decision rules. Rough sets were introduced as a tool to deal with uncertain knowledge in Artificial Intelligence applications.

Equivalence Relation Let X be a set and let x, y, and z be elements of X. An equivalence relation R on X is a relation on X such that: Reflexive property: xRx for all x in X. Symmetric property: if xRy, then yRx. Transitive property: if xRy and yRz, then xRz.

Rough Sets Theory Let T = (U, A, C, D) be a decision system, where: U is a non-empty, finite set called the universe; A is a non-empty, finite set of attributes; and C, D ⊆ A are the condition and decision attribute subsets, respectively. Every attribute a ∈ A is a function a: U → Va, where Va is called the value set of a. The elements of U are objects, cases, states, or observations. The attributes are interpreted as features, variables, characteristics, conditions, etc.

Indiscernibility Relation Let P ⊆ A. The indiscernibility relation IND(P) is defined as follows: IND(P) = {(x, y) ∈ U × U : a(x) = a(y) for all a ∈ P}. IND(P) is an equivalence relation.

Indiscernibility Relation The indiscernibility relation defines a partition of U. U/IND(P) denotes the family of all equivalence classes of the relation IND(P), called elementary sets. Two other families of equivalence classes, U/IND(C) and U/IND(D), called the condition and decision equivalence classes respectively, can also be defined.
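
As a minimal illustration (not from the slides), the partition U/IND(P) can be computed by grouping objects on their value vectors over P; the table layout and names below are assumptions:

from collections import defaultdict

def partition(universe, attrs, value_of):
    """U/IND(attrs): group objects with identical value vectors over attrs."""
    classes = defaultdict(set)
    for x in universe:
        signature = tuple(value_of(x, a) for a in attrs)  # x's values on attrs
        classes[signature].add(x)
    return list(classes.values())

# Tiny hypothetical decision table: object -> attribute values.
table = {'x1': {'a1': 0, 'a2': 1}, 'x2': {'a1': 0, 'a2': 1}, 'x3': {'a1': 1, 'a2': 0}}
print(partition(table, ['a1', 'a2'], lambda x, a: table[x][a]))
# [{'x1', 'x2'}, {'x3'}]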

R-lower approximation Let X ⊆ U and R ⊆ C, where R is a subset of the condition features. The R-lower approximation of X is the set of all elements of U which can be classified with certainty as elements of X: R_*(X) = {x ∈ U : [x]_R ⊆ X}. The R-lower approximation of X is a subset of X.

R-upper approximation The R-upper approximation of X is the set of all elements of U whose equivalence classes intersect X: R^*(X) = {x ∈ U : [x]_R ∩ X ≠ ∅}. X is a subset of its R-upper approximation. The R-upper approximation contains all objects which can possibly be classified as belonging to X. The R-boundary of X is defined as BND_R(X) = R^*(X) − R_*(X).

Representation of the approximation sets If R_*(X) = R^*(X) then X is R-definable (the boundary set is empty). If R_*(X) ≠ R^*(X) then X is rough with respect to R. ACCURACY := Card(R_*(X)) / Card(R^*(X))

Decision Class The decision d determines a partition CLASS_T(d) = {X1, X2, …, Xr} of the universe U, where Xk = {x ∈ U : d(x) = vk} for each value vk of d, 1 ≤ k ≤ r. CLASS_T(d) will be called the classification of objects in T determined by the decision d. The set Xk is called the k-th decision class of T.

Decision Class This decision system has three classes. We represent the partition: lower approximation, upper approximation, and boundary set.

Rough Sets Theory Let us consider U = {x1, x2, x3, x4, x5, x6, x7, x8} and the equivalence relation R with the equivalence classes X1 = {x1, x3, x5}, X2 = {x2, x4} and X3 = {x6, x7, x8}, which form a partition. Let the classification C = {Y1, Y2, Y3} be such that Y1 = {x1, x2, x4}, Y2 = {x3, x5, x8}, Y3 = {x6, x7}. Only Y1 has a non-empty lower approximation, namely R_*(Y1) = X2 = {x2, x4}.
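
A sketch of the lower and upper approximations for this example; only the sets above come from the slide, the helper names are ours:

def lower_approx(eq_classes, X):
    # Union of the equivalence classes wholly contained in X.
    return set().union(*[c for c in eq_classes if c <= X])

def upper_approx(eq_classes, X):
    # Union of the equivalence classes that intersect X.
    return set().union(*[c for c in eq_classes if c & X])

eq = [{'x1', 'x3', 'x5'}, {'x2', 'x4'}, {'x6', 'x7', 'x8'}]
Y1 = {'x1', 'x2', 'x4'}
low, up = lower_approx(eq, Y1), upper_approx(eq, Y1)
print(low)                 # {'x2', 'x4'}: the only non-empty lower approximation
print(up - low)            # boundary set of Y1
print(len(low) / len(up))  # accuracy = 2/5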

Positive region and Reduct The positive region POS_R(d) of the classification CLASS_T(d) is the union of the lower approximations of all decision classes: POS_R(d) = ∪ {R_*(X) : X ∈ CLASS_T(d)}. Reducts are defined as minimal subsets of condition attributes which preserve the positive region defined by the set of all condition attributes, i.e. a subset B ⊆ C is a relative reduct iff (1) POS_B(d) = POS_C(d), and (2) for every proper subset B′ ⊂ B, condition (1) is not true.

Dependency coefficient A measure of association: the dependency coefficient between condition attributes A and a decision attribute d is defined by the formula γ(A, d) = Card(POS_A(d)) / Card(U), where Card represents the cardinality of a set.
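
Reusing the helpers from the earlier sketches, the positive region and γ can be computed as follows:

def positive_region(cond_classes, dec_classes):
    # Union of the lower approximations of all decision classes.
    pos = set()
    for X in dec_classes:
        pos |= lower_approx(cond_classes, X)
    return pos

def dependency(cond_classes, dec_classes, universe):
    # gamma = Card(POS) / Card(U)
    return len(positive_region(cond_classes, dec_classes)) / len(universe)

# With the earlier example: POS = {'x2', 'x4'} and gamma = 2/8 = 0.25.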

Discernibility matrix Let U = {x1, x2, x3, …, xn} be the universe of a decision system. The discernibility matrix M = (cij) is defined by cij = {a ∈ C : a(xi) ≠ a(xj)} for objects xi, xj belonging to different decision classes of the U/D partition, and cij = ∅ otherwise.
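
A sketch of the decision-relative matrix under this definition (the row accessor and parameter names are ours):

def discernibility_matrix(objs, cond_attrs, row, d):
    """cij for pairs of objects with different decisions; omitted otherwise."""
    m = {}
    for i, xi in enumerate(objs):
        for xj in objs[i + 1:]:
            if row(xi)[d] != row(xj)[d]:  # different decision classes
                m[(xi, xj)] = {a for a in cond_attrs if row(xi)[a] != row(xj)[a]}
    return m

A known property connects this to the next slides: an attribute that appears as a singleton entry of the matrix must belong to every reduct, so the union of singleton entries yields the core.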

Dispensable feature Let R be a family of equivalence relations and let P ∈ R. P is dispensable in R if IND(R) = IND(R − {P}); otherwise P is indispensable in R. CORE The set of all indispensable relations in C will be called the core of C: CORE(C) = ∩RED(C), where RED(C) is the family of all reducts of C.

Small Example Let U = {x1, x2, …, x7} be the universe set, C the conditional feature set, and {d} the decision feature set. [The decision table on this slide did not survive transcription.]

Discernibility Matrix [The discernibility matrix for the example did not survive transcription.]

Example Then Core(C) = {a2}. The partition produced by the core is U/{a2} = {{x1, x2}, {x5, x6, x7}, {x3, x4}}, and the partition produced by the decision feature d is U/{d} = {{x4}, {x1, x2, x7}, {x3, x5, x6}}.

Similarity relation A similarity relation SIM on the set of objects assigns to each object x a similarity class SIM(x) containing all objects similar to x. The lower approximation of X is the set of all elements of U which can be classified with certainty as elements of X, i.e. those whose similarity classes are contained in X; the upper approximation is the set of all elements whose similarity classes intersect X. The SIM-positive region of a partition is the union of the lower approximations of its classes.

Similarity measures For numeric attributes, the similarity measure depends on two parameters a and b and is not symmetric; a separate similarity measure is used for nominal attributes. [The formulas on this slide were lost in transcription.]
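
Since the slide's formulas were lost, the sketch below only illustrates the general shape: an asymmetric, parameterized window for numeric attributes and exact match for nominal ones; all specifics here are assumptions, not the slide's definitions.

def similar(x, y, numeric, nominal, a=1.0, b=1.0):
    # Asymmetric window around x's value: tolerance a below, b above.
    for attr in numeric:
        if not (x[attr] - a <= y[attr] <= x[attr] + b):
            return False
    # Nominal attributes: similarity means equality of values.
    return all(x[attr] == y[attr] for attr in nominal)

def similarity_class(x, universe, numeric, nominal, a=1.0, b=1.0):
    # SIM(x): all objects similar to x (not necessarily symmetric in x and y).
    return {yid for yid, y in universe.items() if similar(x, y, numeric, nominal, a, b)}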

Quality of approximation of classification The quality of approximation of a classification is the ratio of all correctly classified objects to all objects. Relative Reduct A subset B ⊆ A is a relative reduct for SIM_A(d) iff (1) B preserves the SIM-positive region of the classification, and (2) for every proper subset of B, condition (1) is not true.

Attribute Reduction The purpose is to select a subset of attributes from the original set of attributes to use in the rest of the process. Selection criterion: the reduct concept. A reduct is the essential part of the knowledge, which suffices to define all basic concepts. Other methods are: the discernibility matrix (n × n); generating all combinations of attributes and then evaluating the classification power or dependency coefficient of each (complete search, sketched below).
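
The complete search can be sketched with the partition and dependency helpers from the earlier snippets; it is exponential in the number of attributes, so it is only feasible for small condition sets:

from itertools import combinations

def all_reducts(universe, cond_attrs, value_of, dec_classes):
    # Dependency achieved by the full condition set.
    full = dependency(partition(universe, cond_attrs, value_of), dec_classes, universe)
    found = []
    for r in range(1, len(cond_attrs) + 1):
        for subset in combinations(cond_attrs, r):
            # Supersets of an already-found reduct are not minimal.
            if any(set(prev) <= set(subset) for prev in found):
                continue
            classes = partition(universe, subset, value_of)
            if dependency(classes, dec_classes, universe) == full:
                found.append(subset)
    return found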

Discretization Methods The purpose is to develop an algorithm that finds a minimal set of cut points that is still consistent. Discretization methods based on rough set theory try to find these cut points. The underlying decision problem: given a set S of points P1, …, Pn in the plane R², partitioned into two disjoint categories S1 and S2, and a natural number T, is there a consistent set of lines such that the partition of the plane into regions defined by them consists of at most T regions?
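
For a single numeric attribute, candidate cuts are conventionally placed at midpoints between consecutive values that carry different decisions; the sketch below shows that convention, not the slide's specific algorithm:

def candidate_cuts(values, decisions):
    """values: numeric attribute values; decisions: aligned class labels."""
    pairs = sorted(zip(values, decisions))
    cuts = []
    for (v1, d1), (v2, d2) in zip(pairs, pairs[1:]):
        if v1 != v2 and d1 != d2:
            cuts.append((v1 + v2) / 2)  # boundary between mixed-class values
    return cuts

print(candidate_cuts([1.0, 2.0, 3.0, 4.0], ['L', 'L', 'H', 'H']))  # [2.5]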

Consistent Def. A set of cuts P is consistent with A (or A-consistent) iff ∂_A = ∂_{A_P}, where ∂_A and ∂_{A_P} are the generalized decisions of A and A_P respectively. Def. A set of cuts P_irr is A-irreducible iff P_irr is A-consistent and every proper subfamily P′ ⊂ P_irr is not A-consistent.

Level of Inconsistency Let B be a subset of A and Lc = Σi Card(B_*(Xi)) / Card(U), where Xi, i = 1, 2, …, n, are the classes of a classification of U. Lc represents the percentage of instances which can be correctly classified into class Xi with respect to subset B.
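
Under this reading (the formula above is our reconstruction; the original was lost), Lc coincides with the dependency coefficient computed on B's partition, so it is a one-liner given the earlier helpers:

def level_of_consistency(universe, B, value_of, dec_classes):
    # Sum of Card(B-lower approximation of each class) / Card(U), via gamma.
    return dependency(partition(universe, B, value_of), dec_classes, universe)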

Imputation Data The rules of the system should be maximal in terms of consistency. The relevant attributes for x are those whose values are known (not missing). Two objects x and y are consistent if they agree on every attribute for which both values are known. Example Let x = (1, 3, ?, 4), y = (2, ?, 5, 4) and z = (1, ?, 5, 4). Then x and z are consistent, while x and y are not consistent (they differ in the first attribute).
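
The consistency test in the example is mechanical: two records are consistent when they agree wherever both values are known, with '?' marking a missing value. A sketch reproducing the example:

def consistent(x, y, missing='?'):
    # Agree on every attribute where both values are known.
    return all(a == b for a, b in zip(x, y) if a != missing and b != missing)

x, y, z = (1, 3, '?', 4), (2, '?', 5, 4), (1, '?', 5, 4)
print(consistent(x, z))  # True
print(consistent(x, y))  # False: they differ on the first attribute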

Decision rules

        F1  F2  F3  F4  D   Rule
    O3  0   0   0   1   L   R1
    O5  0   0   1   3   L   R1
    O1  0   1   0   2   L   R2
    O4  0   1   1   0   M   R3
    O2  1   1   0   2   H   R4

Rule 1: if (F2 = 0) then (D = L)
Rule 2: if (F1 = 0) then (D = L)
Rule 3: if (F4 = 0) then (D = M)
Rule 4: if (F1 = 1) then (D = H)

The algorithm should minimize the number of features included in the decision rules.
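
A sketch that re-checks which objects each single-condition rule covers, with the table encoded as plain dictionaries (the encoding is illustrative):

table = {
    'O3': {'F1': 0, 'F2': 0, 'F3': 0, 'F4': 1, 'D': 'L'},
    'O5': {'F1': 0, 'F2': 0, 'F3': 1, 'F4': 3, 'D': 'L'},
    'O1': {'F1': 0, 'F2': 1, 'F3': 0, 'F4': 2, 'D': 'L'},
    'O4': {'F1': 0, 'F2': 1, 'F3': 1, 'F4': 0, 'D': 'M'},
    'O2': {'F1': 1, 'F2': 1, 'F3': 0, 'F4': 2, 'D': 'H'},
}

def covered(attr, val, decision):
    # Objects matching the condition whose decision agrees with the rule.
    return [o for o, r in table.items() if r[attr] == val and r['D'] == decision]

print(covered('F2', 0, 'L'))  # Rule 1: ['O3', 'O5']
print(covered('F4', 0, 'M'))  # Rule 3: ['O4']
print(covered('F1', 1, 'H'))  # Rule 4: ['O2']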

References [1] Gediga, G. and Düntsch, I. (2002) Maximum Consistency of Incomplete Data via Non-Invasive Imputation. Artificial Intelligence. [2] Grzymala-Busse, J. and Siddhaye, S. (2004) Rough Set Approach to Rule Induction from Incomplete Data. Proceedings of IPMU'2004, the 10th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems. [3] Pawlak, Z. (1995) Rough Sets. Proceedings of the 1995 ACM 23rd Annual Conference on Computer Science. [4] Tay, F. and Shen, L. (2002) A Modified Chi2 Algorithm for Discretization. IEEE Transactions on Knowledge and Data Engineering, Vol. 14, No. 3, May/June. [5] Zhong, N. (2001) Using Rough Sets with Heuristics for Feature Selection. Journal of Intelligent Information Systems, 16, Kluwer Academic Publishers.

THANK YOU!