Concept Switching Azadeh Shakery. Concept Switching: Problem Definition C1C2Ck …

Presentation transcript:

Concept Switching Azadeh Shakery

Concept Switching: Problem Definition
(Figure: communities C1, C2, …, Ck)

Past Work: A Programming Language for Mining Fuzzy ER Graphs
(Figure: example fuzzy ER graph with nodes Forager, Rover, bee, fly, g1, g2, g3, Behavior Term, gene1, gene2, …)

Past Work: A Programming Language for Mining Fuzzy ER Graphs
Operators:
–Neighbor Finding: NBSet, WNBSet
–Path Finding: Shortestpath, Wpath
–Set Operators: Union, Intersect, Cardinality, topk
Added Features:
–Type Definition
–Function Definition
–Seq. Operators: Project, Reverse, Seq2Set, Aggregate

Past Work: High Level Scripts for Entity Comparison
Based on intersection and union of neighbors: NB(e1) ∩ NB(e2) / NB(e1) ∪ NB(e2)
–Tehran, Iran: 27/52
–Baghdad, Iran: 11/52
–Washington, Iran: 0
Based on the shortest path between the two entities
–gpcr__g_protein__plc__diacylglycerol
–bush__leader__khomeini
Based on the length of the shortest path to a base entity
Connection to a center node: NB(e) ∩ NB(c) / NB(e) ∪ NB(c)
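The neighbor-overlap comparison can be sketched in a few lines. This is only an illustration, assuming the score is read as the size of the neighbor intersection divided by the size of the neighbor union; the function name `neighbor_jaccard` and the toy graph are hypothetical and are not taken from the slides.

```python
def neighbor_jaccard(graph, e1, e2):
    """Size of the shared neighbor set divided by the size of the combined set.

    `graph` maps each entity to the set of its neighbors (hypothetical layout).
    """
    nb1 = graph.get(e1, set())
    nb2 = graph.get(e2, set())
    union = nb1 | nb2
    if not union:
        # Two entities without any neighbors have nothing in common.
        return 0.0
    return len(nb1 & nb2) / len(union)


# Made-up neighbor sets, for illustration only.
toy_graph = {
    "tehran": {"iran", "capital", "city"},
    "iran": {"tehran", "capital", "oil"},
    "washington": {"usa", "potomac"},
}

score_close = neighbor_jaccard(toy_graph, "tehran", "iran")
score_far = neighbor_jaccard(toy_graph, "washington", "iran")
```

Entities that share no neighbors score 0, which matches the Washington, Iran case listed on the slide.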

Current Work: Topic/Concept Map
Alternative way of accessing information
Create an index of information which resides outside that information
The topic map describes the information in the documents and databases

Multi-Resolution Topic Maps
(Figure labels: WORDS, WordNet, High Level Concepts; scale runs between low resolution and high resolution)

Multi-Resolution Topic Map
Static
–Discrete Navigation
–Challenges:
  Define resolution
  Community finding algorithm
  Summarize communities
  Define distance between communities
  Between which communities do we allow the navigation?
Dynamic
–Continuous Navigation
–Challenges:
  Define resolution
  Online community finding algorithm
  Summarize communities

Challenges
Resolution definition
–A resolution parameter
–{C1, C2, …, Ck}: communities at this level
–One way is to define the resolution as the link strength threshold
–Threshold → 0: all links; threshold → ∞: no links
–Low resolution: low threshold; high resolution: high threshold
Community finding algorithm
Community distance:
–For communities C1, C2: Similarity(C1, C2) = ? |C1 ∩ C2| / |C1 ∪ C2|
–Works if communities are allowed to intersect
Community summarization
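A minimal sketch of the two ideas on this slide, under stated assumptions: resolution is taken to be a link strength threshold, a link is kept when its strength is at least the threshold, and community similarity is the overlap ratio proposed above. The names `links_at_resolution` and `community_similarity` and the sample data are hypothetical.

```python
import math


def links_at_resolution(weighted_links, threshold):
    """Keep only links whose strength is at least `threshold`.

    `weighted_links` is a list of (node_a, node_b, strength) tuples.
    """
    return [(a, b, w) for a, b, w in weighted_links if w >= threshold]


def community_similarity(c1, c2):
    """Overlap ratio |C1 & C2| / |C1 | C2| of two communities given as sets."""
    union = c1 | c2
    if not union:
        return 0.0
    return len(c1 & c2) / len(union)


# Made-up link strengths, for illustration only.
links = [("a", "b", 0.9), ("b", "c", 0.4), ("c", "d", 0.05)]

all_links = links_at_resolution(links, 0.0)        # threshold 0 keeps every link
some_links = links_at_resolution(links, 0.3)       # weaker links drop out
no_links = links_at_resolution(links, math.inf)    # infinite threshold keeps none

overlap = community_similarity({"a", "b", "c"}, {"c", "d"})
```

The similarity is nonzero only when the two communities share at least one node, which is why it depends on a community finding method that allows overlaps.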

Community Summarization
Use the documents to do the summarization
Summarize based on the community nodes
–Define center nodes to do the summarization:
  Based on the average MI distance to the other nodes in the community
  –Slow on very large communities
  Based on the degree of the nodes
  –Counts all neighbors as equally important
  Based on a PageRank-like algorithm:
  –Each node has a centrality value
  –In each step, each node distributes its centrality to its neighbors in proportion to the strength of the link
  –Do this iteratively until the centrality values converge
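The PageRank-like option can be sketched as follows. This is an illustrative version, not the presenter's implementation: the damping factor, the tolerance, and the even redistribution for isolated nodes are assumptions borrowed from standard PageRank so that the iteration converges, and the name `weighted_centrality` and the toy community are hypothetical.

```python
def weighted_centrality(weights, damping=0.85, tol=1e-10, max_iter=1000):
    """Iteratively spread each node's centrality along its links.

    `weights[u][v]` is the strength of the link u-v; the mapping is expected
    to be symmetric and to contain every node as a key.
    `damping` is an assumption, not something stated on the slide.
    """
    nodes = sorted(weights)
    n = len(nodes)
    centrality = {v: 1.0 / n for v in nodes}
    strength = {v: sum(weights[v].values()) for v in nodes}

    for _ in range(max_iter):
        updated = {v: (1.0 - damping) / n for v in nodes}
        for u in nodes:
            if strength[u] == 0:
                # Isolated node: share its centrality evenly (assumption).
                for v in nodes:
                    updated[v] += damping * centrality[u] / n
                continue
            for v, w in weights[u].items():
                # Each neighbor receives a share proportional to link strength.
                updated[v] += damping * centrality[u] * w / strength[u]
        change = sum(abs(updated[v] - centrality[v]) for v in nodes)
        centrality = updated
        if change < tol:
            break
    return centrality


# Made-up star-shaped community, for illustration only.
toy_community = {
    "brain": {"neural": 1.0, "learning": 1.0, "olfactory": 1.0},
    "neural": {"brain": 1.0},
    "learning": {"brain": 1.0},
    "olfactory": {"brain": 1.0},
}

scores = weighted_centrality(toy_community)
center_node = max(scores, key=scores.get)
```

The node with the highest score would then serve as the center node used to summarize the community.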

Community Finding Algorithms: Newman’s Algorithm
Newman’s algorithm for detecting community structure in networks:
–Modularity: a measure of the quality of a particular division of a network
–Modularity is the fraction of the edges in the network that connect vertices of the same type (within a community) minus the expected value of the same quantity in the same network with random connections
–Consider different divisions of the graph into communities and find the division that maximizes the modularity measure
–The number of distinct community divisions grows exponentially in the number of nodes
–A greedy algorithm is used to solve the problem
–The algorithm runs in O((m + n)n) time
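To make the modularity measure concrete, the sketch below scores a given division of an unweighted, undirected graph. It covers only the scoring step, not the greedy search; the function name `modularity` and the toy graph are hypothetical. For each community it takes the fraction of edges that fall inside the community and subtracts the squared fraction of edge ends attached to it, which is the expected within-community fraction under random connections.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity of a division of an unweighted, undirected graph.

    `edges` is a list of (u, v) pairs; `community_of` maps node -> label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both ends in the community
    edge_ends = defaultdict(int)   # edge ends (degree sum) per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        edge_ends[cu] += 1
        edge_ends[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (edge_ends[c] / (2 * m)) ** 2 for c in edge_ends
    )


# Two triangles joined by a single edge (made-up example).
toy_edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
one_group = {node: "A" for node in range(1, 7)}

q_split = modularity(toy_edges, two_groups)
q_merged = modularity(toy_edges, one_group)
```

Splitting the two triangles apart scores higher than lumping all nodes together, which is the kind of comparison a greedy search would use when deciding which communities to merge.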

Newman’s Algorithm
Communities are of very different sizes
–A few very large communities and a lot of small communities
No overlapping communities
–Definition of neighbor communities is hard
Experiments on bee data:
–1200 records about Apis mellifera (honey bee)
–Thr = Results

Community Finding Algorithms: CPM
Clique Percolation Method (CPM)
–Locates the k-clique communities of unweighted, undirected networks.
–Observation: A typical member in a community is linked to many other members, but not necessarily to all other nodes.
–A community can be interpreted as a union of smaller complete subgraphs that share nodes.
–A k-clique community is defined as the union of all k-cliques that can be reached from each other through a series of adjacent k-cliques.
–Two k-cliques are said to be adjacent if they share k-1 nodes.
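The definition above can be turned into a small brute-force sketch. It enumerates every k-clique by checking all k-node subsets, so it is only suitable for toy graphs and is not how a production CPM implementation would work; the function name `k_clique_communities` and the example graph are hypothetical.

```python
from itertools import combinations


def k_clique_communities(adjacency, k):
    """Union k-cliques that are chained together by sharing k-1 nodes.

    `adjacency` maps each node to the set of its neighbors (symmetric).
    """
    nodes = sorted(adjacency)
    # Brute force: a k-subset is a clique if every pair in it is linked.
    cliques = [
        frozenset(subset)
        for subset in combinations(nodes, k)
        if all(v in adjacency[u] for u, v in combinations(subset, 2))
    ]

    # Union-find over cliques; adjacent cliques end up in the same group.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return list(groups.values())


# Triangles a-b-c and b-c-d share two nodes; triangle d-e-f shares only d.
toy_adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

communities = k_clique_communities(toy_adjacency, 3)
```

In this toy graph node d belongs to both resulting communities, which illustrates how the method allows overlaps.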

Properties of CPM
Not too restrictive (compared to cliques)
Based on the density of links
Local
Does not yield cut-nodes or cut-links (whose removal would disjoin the community)
Allows overlaps

Results
thr = 0.05
–228 nodes
–1197 edges
–CPM: 0 min sec; Newman: 0 min 0.11 sec
–16 communities of more than one node
thr = 0.04
–312 nodes
–1483 edges
–CPM: 0 min sec; Newman: 0 min 0.21 sec
–20 communities of more than one node
thr = 0.03
–507 nodes
–2924 edges
–CPM: 0 min sec; Newman: 0 min 0.49 sec
–29 communities of more than one node
thr = 0.01
–4349 nodes
–28595 edges
–CPM: 5 min sec; Newman: 1 min sec
–103 communities of more than one node

Sample of Resolution Change neural nervous coordination brain proboscis extension conditioning learning system mushroom Homeostasis olfactory juvenile hormone endocrine bodies antennal conditioned chemical reflex proboscis extension conditioning learning conditioned olfactory reflex neural Nervous coordination brain system mushroom bodies neurons homeostasis chemical coordination juvenile hormone jh endocrine

Concept Switching
Construct a topic map for each collection separately
Construct one universal topic map

Discussion
Better ideas for community summarization?
Dynamic via static topic maps?
Alternative ways of defining resolution

Thank you
Questions?