©2004, Philippe Cudré-Mauroux Exploiting Localized Metadata in Decentralized Settings Microsoft Research Asia 09.29.04 Philippe Cudré-Mauroux Distributed.

Presentation transcript:

©2004, Philippe Cudré-Mauroux Exploiting Localized Metadata in Decentralized Settings Microsoft Research Asia Philippe Cudré-Mauroux Distributed Information Systems Laboratory (LSIR) Swiss Federal Institute of Technology, Lausanne (EPFL)

Outline
I. Problem definition
– Goal
– Hurdles
– Proposed Solution
II. Sharing annotated pictures
– Structured metadata standards
– Architecture
– Demo
III. System dynamics
IV. Conclusions

I. Problem Definition
Goal: exploiting local (structured) metadata to organize information (e.g., pictures) globally
Challenges: existing systems do not directly aggregate localized metadata because of three major hurdles:
– Local ontologies
– Dangling links
– Metadata scarceness

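Hurdles
(Figure: an RDF/XML annotation example declaring a default namespace plus the dc and rdf namespaces, containing the values "Compilers in the Key of C", "A lovely classical work", and a reference to #yoyoAgent, followed by "…". Callouts on the example: Local Ontology, Metadata Scarceness, Dangling Link.)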

Local Ontologies
State-of-the-art annotation software and standards provide metadata with respect to ontologies (schemas, taxonomies, …)
Profusion of distinct ontologies
– even for specific pieces of information (e.g., images)
Ontologies are almost always extensible
=> Semantic heterogeneity
– A single ontology cannot be used to retrieve all relevant individuals
– Cf. Peer Data Management

Dangling Links
Local metadata often refer to local individuals
Such references are irrelevant globally
(Figure: example graph with the labels Party, date, DJ, My Cousin, John, Joe, name.)

Metadata Scarceness
Today, most software includes some semi-automatic annotation facilities
However, most metadata still require human attention
– Scarcest resource
=> Metadata Scarceness

Proposed Solution (high-level view)
Local Ontologies
– Alignment of ontologies (in a scalable manner)
– Semantic Gossiping (query expansion)
Dangling Links
– Metadata contextualization (scoping)
– Alignment of individuals
Metadata scarceness
– Clustering of individuals (similarity measure)
– Propagation of metadata

II. Sharing Annotated Pictures
Problem:
– Wide adoption of digital cameras => profusion of digital pictures
  – Several GBs of personal pictures is nowadays the norm
– How can we share these pictures in a meaningful way, i.e., such that users can find the pictures they are looking for?
One possible avenue:
– Leveraging the new structured metadata tools / standards
  – Cf. aforementioned hurdles
  – Pioneering work

Providing Structured Annotations to Pictures
Several emergent tools / standards providing
– Structured metadata (XML, Photoshop Album)
– Ontological metadata (RDF, Adobe XMP)
– Type-based metadata (Microsoft WinFS)
The bottom line: model theory, description logic
– A terminology
– Some assertions

Structured Metadata
Ex.: Photoshop Album
– Hierarchy of tags
– Stored in a relational, proprietary, local database
– Non-exportable

Ontological Metadata (1)
Ex.: Extensible Metadata Platform (XMP)
Subset of RDF/S
Metadata might be embedded into the file
Supported by a wide range of Adobe applications
– Adobe® Acrobat®
– Adobe FrameMaker®
– Adobe GoLive®
– Adobe Illustrator®
– Adobe InCopy®
– Adobe InDesign®
– Adobe LiveMotion™
– Adobe Photoshop®
– Adobe Document Server
– Adobe Graphics Server
– Version Cue™

Ontological Metadata (2)
Ex.: Photoshop XMP schema

Type-Based Metadata (1)
New file system for Longhorn (NTFS +++)
No more hierarchies (i.e., folders) but metadata
– Items
– Attributes
– Relationships
– Schemas
– Sub-Schemas (extensions)
– Déjà vu?

Type-Based Metadata (2)
Ex.: image schema in WinFS

Comparison of three standards
(Table: rows WinFS, XMP, PSA; columns Local Ontologies, Extensibility, Dangling Links, Local Database, Metadata Embedding. The per-cell marks are not preserved in this transcript.)

PicShare: A Middleware for sharing pictures
Feet component view (diagram labels):
– PicShare
– PSP, XMP, WinFS
– Metadata Extractor
– Features Handler (60 moments)
– Information Tracker
– (Distributed) Hashtable: Insert, Retrieve
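The Insert / Retrieve interface of the (distributed) hashtable can be illustrated with a minimal sketch. The slide does not describe the actual implementation, so everything below is an assumption: `ToyHashtable`, its bucket count, and the example keys and values are hypothetical, and a single in-memory process stands in for the distributed structure.

```python
import hashlib


class ToyHashtable:
    """Hypothetical in-memory stand-in for the (distributed) hashtable."""

    def __init__(self, num_buckets=16):
        # Each bucket plays the role of one peer's share of the key space.
        self.buckets = [dict() for _ in range(num_buckets)]

    def _bucket(self, key):
        # Hash the key to decide which bucket (peer) is responsible for it.
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return self.buckets[int(digest, 16) % len(self.buckets)]

    def insert(self, key, value):
        # Several values may share a key, e.g. many images with one property.
        self._bucket(key).setdefault(key, []).append(value)

    def retrieve(self, key):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._bucket(key).get(key, []))


table = ToyHashtable()
table.insert("dc:title", ("peerA/img1.jpg", "Compilers in the Key of C"))
table.insert("dc:title", ("peerB/img7.jpg", "Holiday"))
titles = table.retrieve("dc:title")
```

Indexing by property name is only one possible choice of key; a real deployment would route each key to the responsible peer over the network.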

Extracting Metadata
Local process
One extractor per format
Extraction of individuals (scoping)
– First step to handle dangling links
Aggregation of different metadata types
– One TBOX per schema / peer
  – Terminological axioms
– One ABOX per image
  – Assertions about individuals
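A rough sketch of what the extraction step might produce: one TBOX per schema and peer, one ABOX per image, with local individuals scoped by the peer that defines them. The class names, the `peer#id` scoping convention, and the `localSchema` property names are all hypothetical and chosen only for illustration; the slide does not specify the data model.

```python
from dataclasses import dataclass, field


@dataclass
class TBox:
    """Terminological axioms for one schema at one peer (hypothetical shape)."""
    peer: str
    schema: str
    subproperty_of: dict = field(default_factory=dict)  # property -> parent


@dataclass
class ABox:
    """Assertions about individuals for one image (hypothetical shape)."""
    image: str
    assertions: list = field(default_factory=list)  # (subject, property, value)


def scope(peer, local_id):
    # Assumed scoping convention: qualify a local identifier with its peer,
    # so a reference such as "MyCousin" stays resolvable outside that peer.
    return f"{peer}#{local_id}"


def extract(peer, schema, image, raw):
    """Build an ABox from raw (property, value, is_individual) triples."""
    abox = ABox(image=scope(peer, image))
    for prop, value, is_individual in raw:
        obj = scope(peer, value) if is_individual else value
        abox.assertions.append((abox.image, f"{schema}:{prop}", obj))
    return abox


tbox = TBox("peerA", "localSchema", {"localSchema:depicts": "localSchema:subject"})
abox = extract(
    "peerA",
    "localSchema",
    "img1.jpg",
    [("title", "Party", False), ("depicts", "MyCousin", True)],
)
```

Literal values such as the title are stored unchanged, while references to local individuals are scoped, which is the first step toward handling dangling links.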

Finding Correspondences
Finding mapping candidates
3 levels: Feature Space, Extensional Space, Ontological Space (diagram relating A and B)
Many different heuristics
– Cf. previous presentation
Large-scale application
– Local methods only!

Local Ontologies
Classical ontology alignment problem
– For now:
  – Edit distance on property names (T)
  – Comparison of extensions (E)
  – User feedback (U)
Semantic gossiping (query expansion)
– When searching for a property value, propagate the search to similar predicates
– Also, propagate to subproperties
Keeping track of ontology mappings (Kullback-Leibler distance)
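The edit-distance heuristic on property names, and the propagation of a search to similar predicates, can be sketched as follows. This is a plain Levenshtein distance; the case folding, the `max_distance` threshold of 2, and the example property names are assumptions made for illustration, not values taken from the slides.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similar_predicates(query_prop, known_props, max_distance=2):
    """Return the known property names close enough to the queried one."""
    q = query_prop.lower()
    return [p for p in known_props if edit_distance(q, p.lower()) <= max_distance]


# Hypothetical property names gathered from different peers' schemas.
props = ["Title", "title", "Titel", "Creator", "Author"]
expanded = similar_predicates("title", props)
```

A query on `title` would then also be sent to the predicates in `expanded`. Name similarity alone is a weak signal, which is presumably why the slide pairs it with extension comparison (E) and user feedback (U).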

Dangling Links
Extract individuals along with metadata
– Metadata scoping
Individual alignment
– Based on their structural correspondences (T,S)
– Semantic Gossiping (?)
  – When searching for an individual (value), propagate the search to similar individuals

Metadata Scarceness
Keep track of metadata scarceness
– Entropy of an image (formula on the original slide)
Propagate metadata
– For similar images
  – Low-level similarity (S)
  – Structural similarity (S,T)
  – User-based similarity (U)
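The entropy formula from the slide is not available in this transcript. As general background only, the sketch below computes the Shannon entropy of a discrete distribution and applies it to a hypothetical case: the candidate values an image might receive for one property, weighted by how strongly similar images support each value. The weighting scheme and the example values are assumptions, not the author's definition.

```python
from math import log2


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return sum(-p * log2(p) for p in probs if p > 0)


def normalize(weights):
    """Turn non-negative weights into a probability distribution."""
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical support for candidate values of one property of one image.
well_annotated = normalize({"Party": 9.0, "Concert": 1.0})
poorly_annotated = normalize({"Party": 1.0, "Concert": 1.0, "Wedding": 1.0, "Picnic": 1.0})

h_low = shannon_entropy(well_annotated.values())
h_high = shannon_entropy(poorly_annotated.values())
```

Under this reading, an image whose candidate values are evenly spread has high entropy and would be a natural target for metadata propagation or user feedback.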

Demo

Some Shortcomings
Need for another encoding of the feature vectors
– Now: matching is O(n)
– Need for a low-dimensionality FV encoding
– Prefix-routing compliant
Size of hashtable increases linearly with the number of pictures
– Tradeoff: metadata propagation vs. clustering / matching
Some images / schemas stay isolated
– Different w.r.t. feature vectors
– No (few) metadata

III. System Dynamics (ongoing)
Modeling as a self-organizing system:
– States: all ontology / individual alignments
– Attractors: similarities (T,S,E,U)
– Noise-driven variations: user interactions
Analysis of overall entropy
– Entropy diminishes with user interactions
– Fosters semantic interoperability
  – Choose images with high entropy first!
Query propagation (Kullback-Leibler)
– Semantic networks traversal
– Worst-case scenario (entropic gain)
– High-level view: graph-theoretic problem (cf. cudre04)
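The slides name the Kullback-Leibler measure without saying which distributions are compared. The sketch below shows only the standard discrete KL divergence, applied to an assumed scenario: the distribution of values for a property at the source peer versus the distribution obtained after translating a query through a mapping. The function name and the example numbers are hypothetical.

```python
from math import log2


def kl_divergence(p, q):
    """D_KL(p || q) in bits for two discrete distributions given as dicts.

    Raises ValueError when q assigns zero probability to an outcome that p
    considers possible, since the divergence is infinite in that case.
    """
    total = 0.0
    for outcome, p_x in p.items():
        if p_x == 0:
            continue  # terms with p(x) = 0 contribute nothing
        q_x = q.get(outcome, 0.0)
        if q_x == 0:
            raise ValueError(f"q has no mass on outcome {outcome!r}")
        total += p_x * log2(p_x / q_x)
    return total


# Hypothetical value distributions before and after a schema mapping.
source = {"Party": 0.5, "Concert": 0.5}
mapped = {"Party": 0.25, "Concert": 0.75}

loss = kl_divergence(source, mapped)
```

The divergence is zero when the two distributions coincide and grows as the mapped distribution drifts from the source. Note that it is not symmetric, so it is not a distance in the metric sense.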

IV. Conclusions
Contribution:
– Problem definition
– Initial heuristics
– Prototype
– A possible analytical model
What's next
– Evaluation
  – Any input?
  – Comparison with other approaches?
– Paper writing
Future work
– Improved heuristics
– Full analytical study
  – Incl. objectivity vs. subjectivity of the peers
