

1 ©2004, Philippe Cudré-Mauroux Exploiting Localized Metadata in Decentralized Settings Microsoft Research Asia 09.29.04 Philippe Cudré-Mauroux Distributed Information Systems Laboratory (LSIR) Swiss Federal Institute of Technology, Lausanne (EPFL)

2 Outline
I. Problem definition
–Goal
–Hurdles
–Proposed Solution
II. Sharing annotated pictures
–Structured metadata standards
–Architecture
–Demo
III. System dynamics
IV. Conclusions

3 I. Problem Definition
Goal: exploiting local (structured) metadata to organize information (e.g., pictures) globally
Challenges: existing systems do not directly aggregate localized metadata because of three major hurdles:
–Local ontologies
–Dangling links
–Metadata scarceness

4 Hurdles
<rdf:RDF xmlns="http://web.resource.org/cc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> Compilers in the Key of C A lovely classical work #yoyoAgent …
Figure labels: Local Ontology, Metadata Scarceness, Dangling Link

5 Local Ontologies
State-of-the-art annotation software / standards provide metadata w.r.t. ontologies (schemas, taxonomies…)
Profusion of distinct ontologies
–even for specific pieces of information (e.g., images)
Ontologies are almost always extensible
=> Semantic heterogeneity
–A single ontology cannot be used to retrieve all relevant individuals
–Cf. Peer Data Management

6 Dangling Links
Local metadata often refer to local individuals
Such references are irrelevant globally
Figure labels: Party 04, 10.05.04, date, DJ, My Cousin John, Joe, name

7 Metadata Scarceness
Today, most software includes some semi-automatic annotation facilities
However, most metadata still require human attention
–Scarcest resource
=> Metadata Scarceness

8 Proposed Solution (high-level view)
Local Ontologies
–Alignment of ontologies (in a scalable manner)
–Semantic Gossiping (query expansion)
Dangling Links
–Metadata contextualization (scoping)
–Alignment of individuals
Metadata scarceness
–Clustering of individuals (similarity measure)
–Propagation of metadata

9 II. Sharing Annotated Pictures
Problem:
–Wide adoption of digital cameras => profusion of digital pictures
  Several GBs of personal pictures are nowadays the norm
–How can we share these pictures in a meaningful way, i.e., such that one can find the pictures one is looking for?
One possible avenue:
–Leveraging the new structured metadata tools / standards
  Cf. aforementioned hurdles
  Pioneering work

10 Providing Structured Annotations to Pictures
Several emerging tools / standards providing
–Structured metadata (XML, Photoshop Album)
–Ontological metadata (RDF, Adobe XMP)
–Type-based metadata (Microsoft WinFS)
The bottom line: model theory, description logic
–A terminology
–Some assertions

11 Structured Metadata
Ex.: Photoshop Album
Hierarchy of tags
Stored in a relational, proprietary, local database
Non-exportable

12 Ontological Metadata (1)
Ex.: Extensible Metadata Platform (XMP)
Subset of RDF/S
Metadata might be embedded into the file
Supported by a wide range of Adobe applications
–Adobe® Acrobat®
–Adobe FrameMaker®
–Adobe GoLive®
–Adobe Illustrator®
–Adobe InCopy®
–Adobe InDesign®
–Adobe LiveMotion™
–Adobe Photoshop®
–Adobe Document Server
–Adobe Graphics Server
–Version Cue™

13 Ontological Metadata (2)
Ex.: Photoshop XMP schema

14 Type-Based Metadata (1)
New file system for Longhorn (NTFS +++)
No more hierarchies (i.e., folders) but metadata
Items – Attributes – Relationships – Schemas – Sub-Schemas (extensions)
–Déjà vu?

15 Type-Based Metadata (2)
Ex.: image schema in WinFS

16 Comparison of three standards
Columns: Local Ontologies, Extensibility, Dangling Links, Local Database, Metadata Embedding
WinFS ✓
XMP
PSA ✓ ✓

17 PicShare: A Middleware for sharing pictures
10000 Feet component view:
Figure labels: PicShare, PSP, XMP, WinFS, Metadata Extractor, (Distributed) Hashtable, Insert, Retrieve, Features Handler, 60 moments, Information Tracker
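
The Insert and Retrieve operations on the (distributed) hashtable could be sketched as follows. This is only an illustration under stated assumptions: the class name `MetadataIndex`, the key scheme (SHA-1 of `property=value`), and the in-memory dictionary standing in for the distributed hashtable are all hypothetical and do not come from the slides.

```python
import hashlib
from collections import defaultdict


class MetadataIndex:
    """Hypothetical stand-in for the (distributed) hashtable component."""

    def __init__(self):
        # key -> set of image identifiers; a real deployment would spread
        # these buckets over many peers instead of one local dict.
        self._table = defaultdict(set)

    @staticmethod
    def key(prop, value):
        # Assumed key scheme: hash the property/value pair.
        return hashlib.sha1(f"{prop}={value}".encode("utf-8")).hexdigest()

    def insert(self, image_id, prop, value):
        self._table[self.key(prop, value)].add(image_id)

    def retrieve(self, prop, value):
        # Return a copy so callers cannot mutate the index.
        return set(self._table.get(self.key(prop, value), set()))


index = MetadataIndex()
index.insert("peerA#img1", "dc:title", "Party 04")
index.insert("peerB#img9", "dc:title", "Party 04")
matches = index.retrieve("dc:title", "Party 04")
```

Hashing the property together with the value means a lookup touches a single bucket, at the price of exact-match semantics only; similarity search needs the additional mechanisms described on the later slides.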

18 Extracting Metadata
Local process
One extractor per format
Extraction of individuals (scoping)
–First step to handle dangling links
Aggregation of different metadata types
–One TBOX per schema / peer
  Terminological axioms
–One ABOX per image
  Assertions about individuals
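
A minimal sketch of the extraction step, assuming a very simple data model: one `TBox` per schema and peer, one `ABox` per image, and scoping done by prefixing local identifiers with the peer name. The class names, the `peer#id` scoping format, and the `(property, value, is_individual)` input tuples are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class TBox:
    """Terminological axioms for one schema at one peer (here: property names only)."""
    peer: str
    schema: str
    properties: set = field(default_factory=set)


@dataclass
class ABox:
    """Assertions about the individuals attached to one image."""
    image_id: str
    assertions: list = field(default_factory=list)  # (subject, property, object)


def scope(peer, local_id):
    # Assumed scoping rule: qualify a local identifier with its peer.
    return f"{peer}#{local_id}"


def extract(peer, schema, image_id, raw):
    """raw is a list of (property, value, is_individual) tuples."""
    tbox = TBox(peer, schema)
    abox = ABox(scope(peer, image_id))
    for prop, value, is_individual in raw:
        tbox.properties.add(prop)
        # Only references to local individuals are scoped; literals stay as-is.
        obj = scope(peer, value) if is_individual else value
        abox.assertions.append((abox.image_id, prop, obj))
    return tbox, abox


tbox, abox = extract(
    "peerA", "xmp", "img1",
    [("dc:title", "Party 04", False), ("depicts", "Joe", True)],
)
```

Scoping turns the local reference `Joe` into `peerA#Joe`, so it can no longer be confused with another peer's `Joe`; aligning the two is left to a later step.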

19 Finding Correspondences
Finding Mapping Candidates
3 levels:
Many different heuristics
Cf. Previous presentation
Large-scale application
Local methods only!
Figure labels: Feature Space, Extensional Space, Ontological Space, A, B

20 Local Ontologies
Classical ontology alignment problem
–For now:
  Edit distance on property names (T)
  Comparison of extensions (E)
  User Feedback (U)
Semantic gossiping (query expansion)
–When searching for a property value, propagate the search to similar predicates
–Also, propagate to subproperties
Keeping track of ontology mappings (Kullback-Leibler distance)
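
The edit-distance heuristic and the query expansion step can be illustrated with a short sketch. The Levenshtein distance below is the standard algorithm; the function names, the case-insensitive comparison, the threshold `max_distance=2`, and the `subproperties` dictionary are assumptions made for the example, not values from the slides.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similar_properties(query_prop, known_props, max_distance=2):
    """Property names within max_distance edits of the query (case-insensitive)."""
    q = query_prop.lower()
    return sorted(p for p in known_props
                  if edit_distance(q, p.lower()) <= max_distance)


def expand_query(prop, known_props, subproperties, max_distance=2):
    """Expand a property to similar predicates, then to their subproperties."""
    expanded = {prop}
    expanded.update(similar_properties(prop, known_props, max_distance))
    for p in list(expanded):
        expanded.update(subproperties.get(p, ()))
    return sorted(expanded)


expanded = expand_query(
    "author",
    ["Author", "authors", "creator"],
    {"author": ["firstAuthor"]},
)
```

In this example `creator` is not reached: it is semantically close to `author` but lexically distant, which is why the slide lists extension comparison (E) and user feedback (U) alongside the name-based heuristic (T).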

21 Dangling Links
Extract individuals along with metadata
–Metadata scoping
Individual alignment
–Based on their structural correspondences (T,S)
–Semantic Gossiping (?)
  When searching for an individual (value), propagate the search to similar individuals
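
One way to picture individual alignment is to compare the property/value pairs attached to two scoped individuals. The sketch below uses Jaccard similarity as a stand-in for the structural correspondence measure, which the slides do not specify; the function names, the threshold of 0.5, and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def align_individuals(local, remote, threshold=0.5):
    """Pair each local individual with its most similar remote one, if similar enough.

    local and remote map a scoped identifier to a set of (property, value) pairs.
    """
    matches = []
    for lid, lprops in local.items():
        best = max(remote, key=lambda rid: jaccard(lprops, remote[rid]), default=None)
        if best is not None and jaccard(lprops, remote[best]) >= threshold:
            matches.append((lid, best))
    return matches


local = {"peerA#Joe": {("name", "Joe"), ("role", "DJ")}}
remote = {
    "peerB#p7": {("name", "Joe"), ("role", "DJ"), ("age", "30")},
    "peerB#p9": {("name", "Ann")},
}
pairs = align_individuals(local, remote)
```

Once a pair such as `peerA#Joe` and `peerB#p7` is established, a search for one individual can be propagated to the other, which is the gossiping step the slide marks with a question mark.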

22 Metadata Scarceness
Keep track of metadata scarceness
–Entropy of an image:
Propagate metadata
–For similar images
  Low-level similarity (S)
  Structural similarity (S,T)
  User-based similarity (U)
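
Metadata propagation between similar images might look like the sketch below. It covers only the low-level similarity case and assumes cosine similarity over feature vectors with a threshold of 0.9; the measure, the threshold, the function names, and the data layout are illustrative assumptions, not the prototype's actual choices.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def propagate(images, threshold=0.9):
    """Propose missing metadata for each image from sufficiently similar images.

    images maps an image id to {"features": [...], "metadata": {...}}.
    Returns proposals only; existing metadata is never overwritten.
    """
    proposals = {}
    for tid, target in images.items():
        for sid, source in images.items():
            if sid == tid:
                continue
            if cosine(target["features"], source["features"]) < threshold:
                continue
            for prop, value in source["metadata"].items():
                if prop not in target["metadata"]:
                    proposals.setdefault(tid, {}).setdefault(prop, value)
    return proposals


images = {
    "a": {"features": [1.0, 0.0], "metadata": {"event": "Party 04"}},
    "b": {"features": [0.99, 0.05], "metadata": {}},
    "c": {"features": [0.0, 1.0], "metadata": {}},
}
proposals = propagate(images)
```

Returning proposals instead of writing them directly leaves room for the user-based similarity (U) on the slide: a person can confirm or reject each suggested annotation.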

23 Demo

24 Some Shortcomings
Need for another encoding of the feature vectors
–Now: matching is O(n)
–Need for a low-dimensionality FV encoding
–Prefix-routing compliant
Size of hashtable increases linearly with the number of pictures
–Tradeoff: metadata propagation vs. clustering / matching
Some images / schemas stay isolated
–Different w.r.t. feature vectors
–No (few) metadata

25 III. System Dynamics (ongoing)
Self-organizing system model:
–States: all ontology / individual alignments
–Attractors: similarities (T,S,E,U)
–Noise-driven variations: user interactions
Analysis of overall entropy
–Entropy diminishes with user interactions
–Fosters semantic interoperability
  Choose images with high entropy first!
Query propagation (Kullback-Leibler)
–Semantic networks traversal
–Worst-case scenario (entropic gain)
–High-level view: Graph-theoretic problem (cf. cudre04)
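
The slides do not give the exact definitions behind the entropy and Kullback-Leibler terms, so the sketch below uses the textbook forms over discrete distributions. Modeling each image as a distribution over candidate annotation values, and ranking images by that entropy, is an assumption made for illustration; the function names are hypothetical.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits of a discrete distribution given as {outcome: prob}."""
    return -sum(pk * math.log2(pk) for pk in p.values() if pk > 0)


def kl_divergence(p, q):
    """D_KL(p || q) in bits; infinite if q gives zero mass to an outcome p allows."""
    total = 0.0
    for k, pk in p.items():
        if pk == 0:
            continue
        qk = q.get(k, 0.0)
        if qk == 0:
            return math.inf
        total += pk * math.log2(pk / qk)
    return total


def rank_by_entropy(candidates):
    """Image ids ordered from most to least uncertain annotation."""
    return sorted(candidates, key=lambda i: shannon_entropy(candidates[i]), reverse=True)


candidates = {
    "img1": {"Joe": 1.0},               # annotation already settled
    "img2": {"Joe": 0.5, "John": 0.5},  # maximally uncertain between two values
}
order = rank_by_entropy(candidates)
drift = kl_divergence({"a": 0.5, "b": 0.5}, {"a": 0.25, "b": 0.75})
```

Under these assumptions, asking the user about `img2` first removes one full bit of uncertainty, whereas `img1` has nothing left to resolve. Note that the Kullback-Leibler divergence is not symmetric, so it is a distance only in a loose sense.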

26 IV. Conclusions
Contribution:
–Problem definition
–Initial heuristics
–Prototype
–A possible analytical model
What’s next
–Evaluation
  Any input?
  Comparison with other approaches?
–Paper writing
Future work
–Improved heuristics
–Full analytical study
  Incl. objectivity vs. subjectivity of the peers

27 ©2004, Philippe Cudré-Mauroux Exploiting Localized Metadata in Decentralized Settings Microsoft Research Asia 09.29.04 Philippe Cudré-Mauroux Distributed Information Systems Laboratory (LSIR) Swiss Federal Institute of Technology, Lausanne (EPFL)

