The National Centres of Competence in Research are managed by the Swiss National Science Foundation on behalf of the Federal Authorities 1 Semantic Network.

Presentation transcript:

1 Semantic Network Analysis
Analyzing Semantic Interoperability in Bioinformatic Database Networks
Philippe Cudré-Mauroux, EPFL
Joint work with: Julien Gaugaz, Adriana Budura and Karl Aberer

2 Overview
1. Peer Data Management Systems (PDMS)
2. Semantic Interoperability in the Large
   – Generatingfunctionologic framework
3. The Sequence Retrieval System
   – Degree distribution
   – Analysis of giant component
   – Weighted analysis
4. Conclusions

3 Beyond Keyword Search
→ Searching semantically richer objects in large-scale heterogeneous networks
[Figure: sources holding differently formatted date values (T18:49:03Z, T20:09:28Z, 05/08/2004, Jan 1, 2005); the query asks "date?"]

4 Decentralized Data Integration
Large Scale Information Systems (e.g., WWW)
– Number of sources > 100
– Unreliable data
  Autonomy
– Semi-structured data
  E.g., XML/RDF
– No integrity constraints
– No transactions
– Simple SP queries
  E.g., triple patterns, ranking
– Schemata created by end users
– Network churn
VS
Distributed Databases
– Number of sources < 100
– Consistent data
  Coordination
– Structured data
  E.g., relational data model
– Integrity constraints
– Transactions
– Powerful queries
  E.g., SQL, aggregation
– Schemas created by administrators
– Relatively fixed topology

5 Data Integration: LAV/GAV
Traditional database techniques (e.g., LAV/GAV) rely on centralized schemas to integrate data sources
Not applicable to our context
– Scale (upper ontologies?)
– Churn
– Autonomy
How can we foster semantic interoperability in decentralized settings?
[Figure: a centralized attribute Date and the local attributes myDate and yourDate, with mappings m(Date) = myDate and m(Date) = yourDate]

6 Semantic Interoperability
Photoshop (own schema): 178A8CD8865 Robinson Tunbridge Wells Royal Council …
WinFS (known schema): 178A8CD8866 Henry Peach Robinson Photographer Tunbridge Council …
Q1 = $p/GUID FOR $p IN /Photoshop_Image WHERE $p/Creator LIKE "%Robi%"
T12 = $fs/GUID $fs/Author/DisplayName FOR $fs IN /WinFSImage
Q2 = $p/GUID FOR $p IN T12 WHERE $p/Creator LIKE "%Robi%"
→ Extending semantic interoperability techniques to decentralized settings

7 1. Peer Data Management Systems
Pairwise mappings
– Peer Data Management Systems (PDMS)
Local mappings overcome global heterogeneity
– Iterative query rewriting
[Figure: peers (article, weather) holding differently formatted date values (T18:49:03Z, T20:09:28Z, 05/08/2004, Jan 1, 2005; "date?"), linked by pairwise mappings between es:cDate and xap:CreateDate, between es:cDate and myRDF:Date, and between myRDF:Date and xap:ModifyDate]
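Iterative query rewriting can be pictured as following mapping links hop by hop. The sketch below is only an illustration: the `MAPPINGS` table reuses the attribute names from the slide, but the direction of each mapping and the function name `rewrite_iteratively` are assumptions, not part of the original talk.

```python
from collections import deque

# Hypothetical mapping table built from the attribute names on the slide.
# The direction of each mapping is an assumption made for this sketch.
MAPPINGS = {
    "es:cDate": ["xap:CreateDate", "myRDF:Date"],
    "myRDF:Date": ["xap:ModifyDate"],
}


def rewrite_iteratively(attribute, mappings):
    """Return every attribute reachable from `attribute` and its hop count."""
    hops = {attribute: 0}
    queue = deque([attribute])
    while queue:
        current = queue.popleft()
        for target in mappings.get(current, []):
            if target not in hops:  # rewrite each attribute only once
                hops[target] = hops[current] + 1
                queue.append(target)
    return hops


reachable = rewrite_iteratively("es:cDate", MAPPINGS)
```

Under these assumptions a query on es:cDate reaches xap:ModifyDate only in two hops, through myRDF:Date, which is the sense in which local mappings can bridge schemas that share no direct mapping.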

8 Semantic Mediation Layer
[Figure: three stacked layers, "Physical" layer, Overlay Layer and Semantic Mediation Layer; each pair of adjacent layers is labeled Correlated / Uncorrelated]

9 Schema-to-Schema Graph
Inter-organization of the different schemas used by the peers
– Logical model
– Directed
– Weighted
– Redundant

10 The Semantic Connectivity Graph
Definition (Semantic Interoperability)
Two peers are said to be semantically interoperable if they can forward queries to each other in the Schema-to-Schema graph, potentially through a series of semantic translation links
Idea
– As for physical network analyses, create a connectivity layer to account for semantic interoperability
The Semantic Connectivity Graph S
– Unweighted, irreflexive and non-redundant version of the Schema-to-Schema graph
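As a rough illustration of that last definition, the sketch below collapses a weighted, redundant edge list into a simple directed graph. The edge-list format, the schema names and the function name `semantic_connectivity_graph` are all assumptions made for the example.

```python
def semantic_connectivity_graph(schema_edges):
    """Collapse (source, target, weight) edges into a simple directed graph.

    The result maps each schema to the set of schemas it links to.
    """
    graph = {}
    for source, target, _weight in schema_edges:  # unweighted: weight ignored
        graph.setdefault(source, set())
        graph.setdefault(target, set())
        if source != target:           # irreflexive: drop self-loops
            graph[source].add(target)  # non-redundant: sets merge duplicates
    return graph


# Hypothetical Schema-to-Schema edges, including a duplicate and a self-loop.
edges = [
    ("A", "B", 0.9),
    ("A", "B", 0.4),
    ("B", "A", 0.7),
    ("B", "B", 1.0),
    ("B", "C", 0.5),
]
S = semantic_connectivity_graph(edges)
```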

11 Observations
Theorem
Peers in a set $P_s$ are semantically interoperable iff $S_s$ is strongly connected, with $S_s \equiv \{ s \mid \exists p \in P_s,\ p \leftrightarrow s \}$
Observation 1
A set of peers $P_s$ cannot be semantically interoperable if $|E_s| < |V_s|$
Observation 2
A set of peers $P_s$ is semantically interoperable if $|E_s| > |V_s|(|V_s| - 1) - (|V_s| - 1)$
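The theorem and the two edge-count observations can be checked on small graphs. The following is a minimal sketch, assuming the graph is stored as a dictionary that maps every vertex to the set of its successors; that representation and the function names are choices made for this example. It tests strong connectivity with one forward and one backward traversal.

```python
def reachable_from(graph, start):
    """Return the set of vertices reachable from `start`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for successor in graph[node]:
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


def is_strongly_connected(graph):
    """True if every vertex reaches every other vertex."""
    if not graph:
        return True
    start = next(iter(graph))
    reverse = {node: set() for node in graph}
    for node, successors in graph.items():
        for successor in successors:
            reverse[successor].add(node)
    forward_ok = len(reachable_from(graph, start)) == len(graph)
    backward_ok = len(reachable_from(reverse, start)) == len(graph)
    return forward_ok and backward_ok


def edge_count_verdict(graph):
    """Apply Observations 1 and 2, which only look at |E| and |V|."""
    v = len(graph)
    e = sum(len(successors) for successors in graph.values())
    if e < v:
        return "cannot be interoperable"
    if e > v * (v - 1) - (v - 1):
        return "must be interoperable"
    return "undetermined"


cycle = {"A": {"B"}, "B": {"C"}, "C": {"A"}}
chain = {"A": {"B"}, "B": {"C"}, "C": set()}
full = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
```

The three-vertex cycle shows why the observations are only bounds: it is strongly connected, yet its edge count falls between the two thresholds, so the counts alone leave the question open.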

12 2. Semantic Interoperability in the Large
Question
– How can we analyze semantic interoperability in large-scale PDMS?
Idea: use percolation theory to detect the emergence of a strongly connected component in S
– Necessary condition for vertex-strong connectivity
– Necessary condition for semantic interoperability

13 The Model
Adaptation of a recent graph-theoretic framework
– Newman, Strogatz, Watts 2001
Large-scale semantic graphs as random graphs with arbitrary degree distribution
– Exponentially distributed, small-world, scale-free… graphs
Specificities of our model
– Strong clustering (clustering coefficient cc)
– Bidirectionality (bidirectionality coefficient bc) (for directed networks)
Based on generatingfunctionology
– Percolation: ci > 0
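The transcript does not give the formula for ci, so the sketch below does not attempt to reproduce it. As a point of comparison only, it computes the classical percolation criterion for undirected random graphs without clustering, where a giant component appears when the sum of k(k - 2)p_k over all degrees k is positive. The talk's indicator additionally accounts for cc and bc. The function name and the sample degree counts are hypothetical.

```python
def classical_percolation_indicator(degree_counts):
    """Return sum_k k * (k - 2) * p_k for an undirected degree distribution.

    `degree_counts` maps a degree k to the number of vertices of that degree.
    This is the classical unclustered criterion, not the ci of the talk.
    """
    total = sum(degree_counts.values())
    weighted = sum(k * (k - 2) * count for k, count in degree_counts.items())
    return weighted / total


# Hypothetical heavy-tailed sample: many leaves and a few hubs.
hubby = classical_percolation_indicator({1: 60, 2: 25, 3: 10, 10: 5})
# Hypothetical sparse sample: almost only leaves.
sparse = classical_percolation_indicator({1: 90, 2: 10})
```

A positive value (the first sample) indicates the super-critical regime and a negative one (the second sample) the sub-critical regime.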

14 Size of the giant component
With u the smallest non-negative solution of
[equation not preserved in the transcript]
And $G_1$ the distribution of edges from first- to second-order neighbors:
[equation not preserved in the transcript]
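The slide's own equations are not recoverable from the transcript. For orientation, the unmodified Newman, Strogatz, Watts (2001) framework for undirected graphs gives the giant-component fraction as $S = 1 - G_0(u)$, where $u$ is the smallest non-negative solution of $u = G_1(u)$, $G_0(x) = \sum_k p_k x^k$ and $G_1(x) = G_0'(x) / G_0'(1)$. The sketch below implements that baseline only; it ignores the clustering and bidirectionality terms of the talk's adapted model, and the function name and sample distributions are assumptions.

```python
def giant_component_fraction(p, max_iterations=10_000, tolerance=1e-12):
    """Baseline giant-component size for degree distribution `p` (k -> p_k)."""
    mean_degree = sum(k * pk for k, pk in p.items())

    def g0(x):
        return sum(pk * x ** k for k, pk in p.items())

    def g1(x):
        return sum(k * pk * x ** (k - 1) for k, pk in p.items() if k > 0) / mean_degree

    # Iterating from 0 converges to the smallest non-negative fixed point,
    # because g1 is non-decreasing on [0, 1].
    u = 0.0
    for _ in range(max_iterations):
        nxt = g1(u)
        if abs(nxt - u) < tolerance:
            u = nxt
            break
        u = nxt
    return 1.0 - g0(u)


# Hypothetical distribution: half the vertices have degree 1, half degree 3.
mixed = giant_component_fraction({1: 0.5, 3: 0.5})
# Every vertex has degree 1: only isolated pairs, hence no giant component.
pairs = giant_component_fraction({1: 1.0})
```

For the first distribution the fixed point is u = 1/3, which gives S = 22/27, roughly 0.81.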

15 3. The Sequence Retrieval System (SRS)
Commercial information indexing and retrieval system
Bioinformatic libraries
– EMBL
– SwissProt
– Prosite
– Etc.
Schemas described in a custom language (Icarus)
Mappings (links) from one database to others

16 Why is SRS interesting?
Applying our heuristics on a real large-scale corpus of interconnected databases
– More than 380 databanks
– More than 500 (undirected) links
– Data used by professionals on a daily basis

17 Crawling the SRS schema-to-schema graph
Custom crawler
As of May 2005 (EBI repository)
– 388 nodes
– 518 edges
– Giant connected component: 187 nodes
– Power-law distribution of node degrees
– Clustering coefficient = 0.32
– Diameter = 9
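Two of these statistics, the giant connected component and the diameter, can be computed with plain breadth-first search. The sketch below runs on a small hypothetical undirected graph, not on the SRS crawl, and takes the diameter over the giant component, which is an assumption about how the slide's figure was obtained.

```python
from collections import deque


def bfs_distances(adjacency, start):
    """Return hop distances from `start` to every reachable vertex."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def giant_component(adjacency):
    """Return the vertex set of the largest connected component."""
    seen = set()
    largest = set()
    for node in adjacency:
        if node not in seen:
            component = set(bfs_distances(adjacency, node))
            seen |= component
            if len(component) > len(largest):
                largest = component
    return largest


def diameter(adjacency, component):
    """Longest shortest path between two vertices of `component`."""
    return max(max(bfs_distances(adjacency, node).values()) for node in component)


# Hypothetical undirected graph stored as symmetric adjacency sets.
toy = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "d": {"e"}, "e": {"d"},
    "f": set(),
}
giant = giant_component(toy)
fraction = len(giant) / len(toy)
toy_diameter = diameter(toy, giant)
```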

18 Results
Connectivity indicator ci = 25.4
– Super-critical state
Size of the giant component
– 0.47 (derived)
– 0.48 (observed: 187 of 388 nodes)

19 Graphs with same power-law degree distribution
Varying number of edges

20 10x Bigger Graph

21 Analyzing weighted networks
Do we have a sufficient number of good mappings?
Introducing quality measures from the mappings
– Weights
– Attribute / schema level
– Cf. Chatty Web (WWW03)
Semantic query forwarding
– Per-hop forwarding behaviors
– Only forward if $w_i \geq \tau$
  $\tau = 0$: flooding
  $\tau = 1$: exact answers
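The per-hop forwarding rule can be sketched as a traversal that follows a mapping only when its weight meets the threshold. Everything concrete below is hypothetical: the peer names, the weights, the adjacency format and the function name `peers_reached`; the threshold parameter is called `tau` for this example.

```python
from collections import deque


def peers_reached(mappings, origin, tau):
    """Return the peers a query reaches when forwarded only if weight >= tau.

    `mappings` maps a peer to a list of (neighbor, weight) pairs.
    """
    reached = {origin}
    queue = deque([origin])
    while queue:
        peer = queue.popleft()
        for neighbor, weight in mappings.get(peer, []):
            if weight >= tau and neighbor not in reached:
                reached.add(neighbor)
                queue.append(neighbor)
    return reached


# Hypothetical weighted mapping network.
network = {
    "p1": [("p2", 1.0), ("p3", 0.4)],
    "p2": [("p4", 0.8)],
    "p3": [("p4", 0.2)],
}
flooding = peers_reached(network, "p1", tau=0.0)
selective = peers_reached(network, "p1", tau=0.5)
exact = peers_reached(network, "p1", tau=1.0)
```

With a threshold of 0 every mapping is followed (flooding); raising the threshold prunes low-quality mappings and shrinks the set of peers that see the query.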

22 Weighted Results
Same degree distribution (388 nodes)
Uniformly distributed weights between 0 and 1

23 4. Conclusions
Analyzing a real network of bioinformatic databases
– Accurate results (even for relatively small networks)
– Weighted / unweighted
Current work
– Composition of weights along a path
– Semantic random walkers
– Public domain simulator
Future work
– Analyzing other forwarding behaviors
– Implementation in a real PDMS (self-organizing mappings): GridVine

24 References
A Necessary Condition for Semantic Interoperability in the Large. Philippe Cudré-Mauroux and Karl Aberer. ODBASE 2004.
GridVine: Building Internet-Scale Semantic Overlay Networks. Karl Aberer, Philippe Cudré-Mauroux and Tim van Pelt. ISWC 2004.
Semantic Overlay Networks (Tutorial). Karl Aberer and Philippe Cudré-Mauroux. VLDB 2005.
… complete reference list at

25 Thank you for your attention
Questions?