Department of Computer Science and Engineering (CSE), BUET

Recurrent Breast Cancer Detection Based on Association Rules Using Frequent Itemset Mining
Md. Samiul Saeef, Md. Siddiqur Rahman

Introduction
Breast cancer is one of the leading cancers in women and the second most common cause of cancer death in women. It often recurs anywhere from 2 to 15 years after initial treatment. Data mining methods can help detect breast cancer recurrence.

Interestingness Criteria

Objective
Our research aims to help medical experts detect recurrent breast cancer by providing strong rules extracted from a cancer patient database. We use the Apriori algorithm for frequent itemset mining to discover these strong association rules.

Experimental Setup
Dataset: The dataset for this work was collected from the UCI Machine Learning Repository. It has 10 variables and 286 patient records.
Tool: WEKA 3.6.10 was used to explore the behavior of the Apriori algorithm for finding the association rules. The .csv file is converted into an .arff file, which is the format the WEKA tool accepts. The minimum support defined by the tool for the generated rules is 0.1.

About Breast Cancer
Inside a woman's breast are 15 to 20 sections, or lobes. Each lobe is made of many smaller sections called lobules. Fibrous tissue and fat fill the spaces between the lobules and ducts (thin tubes that connect the lobes and nipples [Fig. 1]). Breast cancer occurs when cells in the breast grow out of control and form a growth or tumor. Tumors may be cancerous (malignant) or benign.

Fig. 1: Normal Breast tissue

Recurrent breast cancer is breast cancer that comes back after initial treatment. Although the initial treatment is aimed at eliminating all cancer cells, a few may have evaded treatment and survived. These undetected cancer cells multiply, becoming recurrent breast cancer.

Association Rule Mining
Association rules are useful for analyzing and predicting future events.
Apriori Algorithm: Apriori is a classic algorithm for frequent itemset mining and association rule learning over transactional databases. Association rule mining with Apriori uses a "bottom-up" approach, breadth-first search, and a hash tree structure to count the candidate itemsets efficiently. The algorithm is illustrated by the flowchart in Fig. 2, and its steps are listed below:
Step 1: Initially scan the DB once to get the frequent 1-itemsets
Step 2: Generate length (k + 1) candidate itemsets from length k frequent itemsets
Step 3: Test the candidates against the DB
Step 4: Terminate when no frequent or candidate set can be generated

Fig. 2: Flowchart of Apriori Algorithm

Experimental Result
Some association rules for detecting recurrent breast cancer in breast cancer patients are listed below, and a visual form of breast cancer using all attributes is presented in Fig. 3.

Fig. 3: Visual form of breast cancer using all attributes.

Rule-1: menopause=ge40 inv-nodes=0-2 deg-malig=1 irradiat=no 30 ==> Class=no-recurrence-events 29 conf:(0.97)
Rule-2: menopause=ge40 deg-malig=1 irradiat=no 30 ==> inv-nodes=0-2
Rule-3: node-caps=yes 56 ==> Class=recurrence-events 31 conf:(0.55)
Rule-4: deg-malig=3 85 ==> Class=recurrence-events 45 conf:(0.53)

Future Work
Future work is to apply data mining methods to large datasets with numerous patient attributes, so that a good number of significant rules can be extracted and recurrence in breast cancer predicted more accurately.

Conclusion
In our research we developed a prediction model for recurrent breast cancer. Specifically, we used a popular data mining method: frequent itemset mining.

References
Chaurasia, Vikas, and Saurabh Pal. "Data Mining Techniques: To Predict and Resolve Breast Cancer Survivability." International Journal of Computer Science and Mobile Computing (IJCSMC) 3.1 (2014): 10-22.
Sharma, Neha, and Hari Om. "Significant Patterns Extraction to Find Most Effective Treatment for Oral Cancer Using Data Mining." Systems Thinking Approach for Social Problems. Springer India, 2015. 385-396.
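
The four Apriori steps listed under Association Rule Mining, and the conf values attached to the rules, can be illustrated with a minimal Python sketch. It is not the WEKA implementation used for the poster: it counts candidates with plain dictionaries rather than a hash tree, the names `apriori` and `confidence` are hypothetical, and the five records are made-up toy data that only borrow attribute names from the rules above, not rows from the UCI dataset.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Step 1: scan the data once to count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c / n >= min_support}
    result = dict(frequent)

    k = 1
    while frequent:
        # Step 2: join frequent k-itemsets into (k+1)-candidates and drop
        # any candidate that has an infrequent k-subset.
        prev = set(frequent)
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k + 1 and all(
                    frozenset(sub) in prev for sub in combinations(union, k)
                ):
                    candidates.add(union)
        # Step 3: test the candidates against the data.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c / n >= min_support}
        result.update(frequent)
        k += 1
    # Step 4: the loop stops once no frequent itemset is left to extend.
    return result


def confidence(itemsets, antecedent, consequent):
    """conf(A ==> B) = count(A and B) / count(A), using frequent-itemset counts."""
    a = frozenset(antecedent)
    return itemsets[a | frozenset(consequent)] / itemsets[a]


# Hypothetical toy records (NOT the UCI data), written as attribute=value items.
records = [
    {"node-caps=yes", "deg-malig=3", "Class=recurrence-events"},
    {"node-caps=yes", "deg-malig=3", "Class=recurrence-events"},
    {"node-caps=yes", "deg-malig=2", "Class=no-recurrence-events"},
    {"node-caps=no", "deg-malig=1", "Class=no-recurrence-events"},
    {"node-caps=no", "deg-malig=1", "Class=no-recurrence-events"},
]

itemsets = apriori(records, min_support=0.3)
print(confidence(itemsets, {"node-caps=yes"}, {"Class=recurrence-events"}))
```

The conf values reported with the rules follow the same ratio of the two counts shown in each rule: Rule-3 gives 31/56 ≈ 0.55 and Rule-4 gives 45/85 ≈ 0.53.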