Fuzzy Versus Quantitative Association Rules: A Fair Data-Driven Comparison. Shih-Ming Bai and Shyi-Ming Chen

1 Fuzzy Versus Quantitative Association Rules: A Fair Data-Driven Comparison Shih-Ming Bai and Shyi-Ming Chen Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, R.O.C.

2 Outline 1. Introduction 2. Association Rule Mining 3. Experimental Approach 4. Conclusion

3 1. Introduction The discovery of knowledge in databases, also called data mining, is a promising and important research area. In data mining, association rules are often used to represent and identify dependencies between attributes in a database. Classical association rule mining assumes binary attributes, but in most real-life applications databases contain many other values besides 0 and 1. Quantitative attributes such as age or income are very common.

4 2. Association Rule Mining Table I and Table II present what could happen if we replace the quantitative attributes in a small database by either binary or fuzzy attributes.
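The replacement of a quantitative attribute by a binary or a fuzzy attribute can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy records, the interval and ramp boundaries, and the choice of the minimum as the conjunction with a sigma-count for support. None of it is taken from Table I, Table II, or the FAM95 data.

```python
# Hypothetical toy database: (age, income in thousands). Not taken from the slides.
records = [(23, 18), (31, 42), (38, 55), (41, 61), (58, 47), (64, 30)]


def crisp_middle_aged(age):
    """Binary attribute: 1 inside the crisp interval [30, 50), otherwise 0."""
    return 1.0 if 30 <= age < 50 else 0.0


def fuzzy_middle_aged(age):
    """Trapezoidal membership: rises on 25..35, is 1 on 35..45, falls on 45..55."""
    if age <= 25 or age >= 55:
        return 0.0
    if age < 35:
        return (age - 25) / 10
    if age <= 45:
        return 1.0
    return (55 - age) / 10


def crisp_high_income(income):
    """Binary attribute: 1 if income is at least 45, otherwise 0."""
    return 1.0 if income >= 45 else 0.0


def fuzzy_high_income(income):
    """Ramp membership: 0 up to 35, rising linearly to 1 at 55."""
    if income <= 35:
        return 0.0
    if income >= 55:
        return 1.0
    return (income - 35) / 20


def support(degrees_a, degrees_b):
    """Support of A => B: average degree to which a record satisfies A and B.

    The minimum models the conjunction (an assumed, common choice). For 0/1
    degrees this is the usual fraction of records containing both A and B.
    """
    return sum(min(a, b) for a, b in zip(degrees_a, degrees_b)) / len(degrees_a)


def confidence(degrees_a, degrees_b):
    """Confidence of A => B: degree of (A and B) relative to the degree of A."""
    return sum(min(a, b) for a, b in zip(degrees_a, degrees_b)) / sum(degrees_a)


crisp_a = [crisp_middle_aged(age) for age, _ in records]
crisp_b = [crisp_high_income(income) for _, income in records]
fuzzy_a = [fuzzy_middle_aged(age) for age, _ in records]
fuzzy_b = [fuzzy_high_income(income) for _, income in records]

print("crisp:", support(crisp_a, crisp_b), confidence(crisp_a, crisp_b))
print("fuzzy:", support(fuzzy_a, fuzzy_b), confidence(fuzzy_a, fuzzy_b))
```

In this sketch the record with age 31 counts fully as middle-aged in the binary version but only partly in the fuzzy version, so the two versions can assign different support and confidence to the same rule.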

9 3. Experimental Approach A. Data Set: FAM95 FAM95.DAT contains data for the 63,756 families that were interviewed in the March 1995 Current Population Survey (CPS).

10 B. Data-Driven Partition: Fuzzy c-means algorithm [Formula and figure for m = 1 not preserved in the transcript]

11 [Figures for m = 2 and m = 3 not preserved in the transcript]
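A data-driven partition with fuzzy c-means can be sketched as follows. This is a minimal one-dimensional version of the standard fuzzy c-means updates, which may differ in detail from the formula on the slide. The sample ages, the initial centers, and the function names are hypothetical. Since the standard membership formula is undefined at m = 1, the sketch treats m = 1 as the crisp limit, where each value belongs fully to its nearest center.

```python
def membership_row(x, centers, m):
    """Membership degrees of value x in each cluster, for fuzzifier m >= 1."""
    dists = [abs(x - c) for c in centers]
    if m == 1 or min(dists) == 0:
        # Crisp case (assumed limit for m = 1), also used when x sits on a center.
        nearest = dists.index(min(dists))
        return [1.0 if j == nearest else 0.0 for j in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    # Standard FCM membership: u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)).
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]


def fuzzy_c_means_1d(values, centers, m=2.0, iterations=50):
    """Alternate membership and center updates; return (centers, memberships)."""
    centers = list(centers)
    for _ in range(iterations):
        memberships = [membership_row(x, centers, m) for x in values]
        updated = []
        for j in range(len(centers)):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            if total == 0:
                updated.append(centers[j])  # empty cluster: keep the old center
            else:
                updated.append(sum(w * x for w, x in zip(weights, values)) / total)
        centers = updated
    memberships = [membership_row(x, centers, m) for x in values]
    return centers, memberships


# Hypothetical ages, not taken from FAM95.
ages = [20, 22, 25, 40, 42, 45, 65, 68, 70]
crisp_centers, crisp_u = fuzzy_c_means_1d(ages, [20, 45, 70], m=1)
fuzzy_centers, fuzzy_u = fuzzy_c_means_1d(ages, [20, 45, 70], m=3)
print(crisp_centers)
print(fuzzy_centers)
```

With m = 1 every membership degree is 0 or 1, so the result is a set of crisp intervals. With larger m the degrees spread out, and values between two centers belong partly to both clusters.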

12 C. Comparing Association Rules The authors compare the rankings obtained by the quantitative and the fuzzy algorithm using the Spearman rank correlation coefficient.
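The rank comparison can be sketched in plain Python. The confidence values below are hypothetical, and ranking the rules by confidence is an assumption made for illustration, since the slide does not say which measure defines the ranking. Ties receive the average of the ranks they span, and the coefficient is computed as the Pearson correlation of the two rank lists, which for untied data equals 1 - 6 * sum(d^2) / (n * (n^2 - 1)).

```python
def average_ranks(scores):
    """Rank scores in descending order (1 = strongest); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores equal to the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(scores_a, scores_b):
    """Spearman rank correlation: Pearson correlation of the two rank lists.

    Raises ZeroDivisionError if all scores in one list are equal.
    """
    ranks_a = average_ranks(scores_a)
    ranks_b = average_ranks(scores_b)
    n = len(ranks_a)
    mean_a = sum(ranks_a) / n
    mean_b = sum(ranks_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ranks_a, ranks_b))
    var_a = sum((a - mean_a) ** 2 for a in ranks_a)
    var_b = sum((b - mean_b) ** 2 for b in ranks_b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical confidences of the same five rules under each algorithm.
crisp_confidence = [0.91, 0.85, 0.80, 0.72, 0.60]
fuzzy_confidence = [0.89, 0.86, 0.75, 0.78, 0.58]
print(spearman(crisp_confidence, fuzzy_confidence))
```

A coefficient close to 1 means the two algorithms order the rules almost identically, while a value near 0 means the rankings are unrelated.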

13 D. Quantitative Versus Fuzzy Association Rules Table III lists the 20 strongest rules obtained from the discrete algorithm (m = 1) and the fuzzy algorithm (m = 3), along with their confidence and support values.

16 4. Conclusion The typical motivation for involving fuzzy set theory in association rule mining is as follows: 1) it allows the rules to be formulated using vague linguistic expressions, which makes them easier for humans to grasp; 2) it suppresses the unwanted effect that boundary cases might cause.

17 But quantitative association rule mining also gives (the same strong) rules formulated in the same way in natural language. The sharp boundary problem is already inherently suppressed and can be further minimized by using sensible partitioning methods, as is already being done in quantitative association rule mining.

18 Hence, we may expect rules obtained using a data-driven approach to be significantly different from the rules obtained using an expert-driven approach. The comparison of fuzzy and quantitative association rules using an expert-driven approach (for large databases) is certainly an interesting topic for future research. In this case, however, experts should also define the crisp intervals that correspond best to human intuition! The common practice of comparing data-driven crisp data mining with expert-driven fuzzy data mining does not provide convincing arguments for the introduction of fuzzy association rules.

19 Thank You!