Clustering Pathways Using Graph Mining Approach Mahmud Shahriar Hossain Monika Akbar Pramodh Pochu Venkata Sesha Sanagavarapu.

Presentation transcript:

2 Design Pipeline
Components: Preprocessor, Frequent Subgraph Discovery, Graph Objects of Pathways, Mined Data, Pathway Clustering, STKE Dataset, NN Search, Pathway Relations

3 Dataset Properties (size)

4

5 pf-ipf (tf-idf)
Transaction: Items bought
David Lopez: Orange Juice (2), Potato chip (3), Pepsi (1)
Robbie Lamb: Potato chip (3), Pepsi (3), Beer (1)
Jonathan Branden: Potato chip (1), Pepsi (1)
John Paxton: Potato chip (2), Coconut Cookies (2), Pepsi (1)
Rafal Angryk: Swiss Army Knife (15)
Jeannete Radclif: Potato chip (2), Coconut Cookies (3)
Rocky Ross: Orange Juice (2), Coconut Cookies (3)
Richard MaClaster: Coconut Cookies (3), Beer (1)
...: ...
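The slide pairs pf-ipf with tf-idf and illustrates it with a shopping table. As a rough sketch only, the following computes a plain tf-idf weight for each item over the rows visible on the slide. The normalised term frequency, the natural-log idf, and the function name `tf_idf` are assumptions made for illustration; the slides do not give the exact pf-ipf formula, and the elided rows are not included.

```python
import math

# Rows visible on the slide: buyer -> {item: quantity}. Elided rows are omitted.
transactions = {
    "David Lopez": {"Orange Juice": 2, "Potato chip": 3, "Pepsi": 1},
    "Robbie Lamb": {"Potato chip": 3, "Pepsi": 3, "Beer": 1},
    "Jonathan Branden": {"Potato chip": 1, "Pepsi": 1},
    "John Paxton": {"Potato chip": 2, "Coconut Cookies": 2, "Pepsi": 1},
    "Rafal Angryk": {"Swiss Army Knife": 15},
    "Jeannete Radclif": {"Potato chip": 2, "Coconut Cookies": 3},
    "Rocky Ross": {"Orange Juice": 2, "Coconut Cookies": 3},
    "Richard MaClaster": {"Coconut Cookies": 3, "Beer": 1},
}


def tf_idf(transactions):
    """Return buyer -> {item: weight} using tf = qty / total and idf = ln(N / df).

    This is a generic tf-idf variant (an assumption), not the deck's pf-ipf.
    """
    n = len(transactions)
    # Document frequency: in how many transactions each item appears.
    df = {}
    for items in transactions.values():
        for item in items:
            df[item] = df.get(item, 0) + 1
    weights = {}
    for buyer, items in transactions.items():
        total = sum(items.values())
        weights[buyer] = {
            item: (qty / total) * math.log(n / df[item])
            for item, qty in items.items()
        }
    return weights


weights = tf_idf(transactions)
```

Under this weighting, an item bought by a single customer (Swiss Army Knife) scores higher than one that appears in most transactions (Potato chip), which is the intuition the table seems intended to convey.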

6 Dataset Properties (pf-ipf)

7

8 Subgraph Discovery (min_sup = 2%)
Table columns: k, # of subgraphs generated, time (sec.)
Table values: 11,376Existing 25, , , ,
What is so novel about pruning edges?
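The slide mentions a minimum support threshold and edge pruning but does not spell out the procedure. A minimal sketch of one common reading, assuming each pathway is a set of directed edges and that an edge is kept only if it occurs in at least a `min_sup` fraction of the graphs (the k = 1 step of an Apriori-style miner). The function names and the toy graphs are hypothetical.

```python
from collections import Counter


def frequent_edges(graphs, min_sup):
    """Return {edge: support} for edges present in >= min_sup of the graphs.

    graphs  : list of sets of (source, target) edges (hypothetical encoding)
    min_sup : minimum support as a fraction between 0 and 1
    """
    counts = Counter()
    for edges in graphs:
        counts.update(set(edges))  # count each edge once per graph
    n = len(graphs)
    return {e: c / n for e, c in counts.items() if c >= min_sup * n}


def prune_graphs(graphs, frequent):
    """Drop infrequent edges so later, larger candidates are built on less data."""
    return [{e for e in edges if e in frequent} for edges in graphs]


# Toy data, not taken from the STKE dataset.
graphs = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "D")},
    {("A", "B"), ("B", "C")},
    {("E", "F")},
]
frequent = frequent_edges(graphs, min_sup=0.5)
pruned = prune_graphs(graphs, frequent)
```

Because any frequent subgraph can only contain frequent edges, removing infrequent edges up front shrinks the candidate space for every larger k.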

9 Subgraph Discovery

10 Subgraph Discovery

11 Subgraph Discovery

12 Subgraph Discovery
Table columns: k, Number of Subgraphs, Time Saved (%), Attempts Saved (%)
Overall attempts saved = 89.52%
Overall time saved = 99.39%

13 Clustering

14 Clustering

15 Nearest Neighbors
Cover Tree and Brute-force method
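Of the two methods named on the slide, the brute-force one is simple enough to sketch. The version below assumes pathways are represented as numeric feature vectors compared by Euclidean distance; both choices, and the name `brute_force_knn`, are assumptions, since the slides do not state the representation or the metric.

```python
import math


def brute_force_knn(query, points, k):
    """Return the indices of the k points closest to query.

    Every point is compared against the query, so one search costs O(n)
    distance computations; a cover tree aims to avoid scanning all points.
    """
    ranked = sorted((math.dist(query, p), i) for i, p in enumerate(points))
    return [i for _, i in ranked[:k]]


# Toy vectors, not taken from the deck.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
nearest = brute_force_knn((0.0, 0.0), points, k=2)
```

A brute-force scan like this is also a convenient baseline for checking the results of an index structure such as a cover tree.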

16 Pathway Relations (StoryTelling)
Bidirectional Search between S and T; nodes on the S side: p1, p2, p3; nodes on the T side: p7, p8, p9
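Bidirectional search can be sketched as two breadth-first searches, one growing from S and one from T, that stop when their frontiers meet. The code below is a generic illustration under the assumption of an undirected neighbour graph stored as an adjacency dict; the function name and the toy chain are hypothetical, and the deck's actual storytelling algorithm may differ.

```python
from collections import deque


def bidirectional_search(neighbors, source, target):
    """Return a path from source to target, or None if they are not connected.

    neighbors: dict mapping a node to an iterable of adjacent nodes
               (assumed symmetric, i.e. an undirected graph).
    """
    if source == target:
        return [source]
    parents_s, parents_t = {source: None}, {target: None}
    frontier_s, frontier_t = deque([source]), deque([target])

    def expand(frontier, parents, other_parents):
        # Expand one full BFS level; return a node seen by both searches.
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nxt in neighbors.get(node, ()):
                if nxt not in parents:
                    parents[nxt] = node
                    if nxt in other_parents:
                        return nxt
                    frontier.append(nxt)
        return None

    while frontier_s and frontier_t:
        meet = expand(frontier_s, parents_s, parents_t)
        if meet is None:
            meet = expand(frontier_t, parents_t, parents_s)
        if meet is not None:
            # Walk back to the source, then forward to the target.
            path, node = [], meet
            while node is not None:
                path.append(node)
                node = parents_s[node]
            path.reverse()
            node = parents_t[meet]
            while node is not None:
                path.append(node)
                node = parents_t[node]
            return path
    return None


# Toy chain of pathways, not taken from the deck.
chain = {
    "S": ["p1"],
    "p1": ["S", "p2"],
    "p2": ["p1", "p3"],
    "p3": ["p2", "T"],
    "T": ["p3"],
}
story = bidirectional_search(chain, "S", "T")
```

Searching from both ends keeps each frontier shallow, which usually means far fewer nodes are visited than in a single search from S.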

17 Pathway Relations (StoryTelling)

18 Pathway Relations (StoryTelling)

19 Pathway Relations (StoryTelling)

20 Questions ???