21.09.06 Krzysztof Janowicz Towards a Similarity-Based Identity Assumption Service for Historical Places Establishing Meaningful Links Krzysztof Janowicz;


1 21.09.06 Krzysztof Janowicz Towards a Similarity-Based Identity Assumption Service for Historical Places Establishing Meaningful Links Krzysztof Janowicz; Muenster Semantic Interoperability Lab (MUSIL)

2 Outline
Motivation
Scenario
Annotation
Theory
Further Work
Image from: http://de.wikipedia.org/wiki/HMS_Victory (Bleiglass, 1998)

3 Motivation: For the cultural heritage community
Incomplete and vague knowledge
Interchange between external sources is necessary to answer complex scientific questions and to clean up local knowledge
Local versus global identifiers
→ Accessible service-based infrastructure!

4 Motivation: For semantic similarity research
Application of similarity in a real-world domain
Similarity as part of the identity assumption puzzle
Combination of similarity and classical reasoning
Using a stable upper-level ontology (CIDOC CRM)
→ Theory of similarity assumptions for historical places

5 Motivation: For an identity assumption service
To run queries against multiple sources, it must be ensured that they refer to the same real-world phenomena; a common language alone is not enough!
Non-unique place names (even within the same area)
Place names refer to cities, rivers, valleys, mountains, …
Misinterpreted place names (e.g. 'Al Wahat' → Oasis)
Names also refer to varying geopolitical units (e.g. nomads) or prominent (artificial) landmarks (e.g. telegraph stations)
Outdated place or even country names (e.g. USSR)
→ Gazetteers can only partially solve these problems
(From discussions with Dr. Karl-Heinz Lampe; ZFMK)

6 Battle of Trafalgar - Scenario
Took place at Cape Trafalgar (Province Cadiz) in 1805
British victory under the command of Horatio Nelson
HMS Victory was Nelson's flagship
Nelson was shot during the battle and died afterwards
→ Should be easy to annotate!?
Spatial relation between naval battleground and terrestrial cape, Province Cadiz, ...?
Place names: Cabo Trafalgar, Taraf al-Gharb, رأس الطرف الأغر
Also in a historical source from a French perspective?
Vice-Admiral Horatio Nelson, 1st Viscount Nelson?
HMS Victory: Which one?!
Temporal relations?
Image from: http://en.wikipedia.org/wiki/Horatio_Nelson (painted by Nicholas Pocock)

7 From: http://en.wikipedia.org/wiki/Image:Trafalgar_aufstellung.jpg

8 Annotation of Historical Knowledge
CIDOC conceptual reference model (CRM) as upper-level ontology for the cultural heritage domain
Specifies an abstract and interrelated vocabulary instead of concrete definitions such as for kinds of exhibits → heterogeneous domain!
Describes historical knowledge by relations between places, events, actors and objects
RDF(S)-based representation
ISO Standard (ISO/PRF 21127)

9 Annotation Examples (RDF Triples)
P89F.falls_within(E53.Place(Cape Trafalgar), E53.Place(Province Cádiz))
Subject-Predicate-Object: The place Cape Trafalgar falls within a place called Province Cádiz
P8F.took_place_at(E7.Activity(Battle of Trafalgar), E53.Place(Cape Trafalgar))
P117F.occurs_during(E7.Activity(Battle of Trafalgar), E5.Event(Trafalgar Campaign))
P14F.carried_out_by(E7.Activity(Battle of Trafalgar), E21.Person(Nelson))
P2F.has_type(E53.Place(Andalusia), E55.Type(regions))
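The annotations on this slide can be held in a very small data structure. The following is a minimal Python sketch, not the service's actual representation: the `Triple` type and the `describe` helper are hypothetical names introduced here, and the labels are copied from the slide as plain strings rather than parsed into RDF resources.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


# The five annotations from the slide, written as (subject, predicate, object).
TRIPLES = [
    Triple("E53.Place(Cape Trafalgar)", "P89F.falls_within", "E53.Place(Province Cádiz)"),
    Triple("E7.Activity(Battle of Trafalgar)", "P8F.took_place_at", "E53.Place(Cape Trafalgar)"),
    Triple("E7.Activity(Battle of Trafalgar)", "P117F.occurs_during", "E5.Event(Trafalgar Campaign)"),
    Triple("E7.Activity(Battle of Trafalgar)", "P14F.carried_out_by", "E21.Person(Nelson)"),
    Triple("E53.Place(Andalusia)", "P2F.has_type", "E55.Type(regions)"),
]


def describe(entity, triples):
    """Return every triple in which the entity occurs as subject or object."""
    return [t for t in triples if entity in (t.subject, t.obj)]


for t in describe("E53.Place(Cape Trafalgar)", TRIPLES):
    print(t.subject, t.predicate, t.obj)
```

Collecting the triples that mention an entity in either position gives the "description" of that entity that the later comparison steps operate on.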

10 Theory
In practice, semi-automatic disambiguation via gazetteers and other global authorities (such as for historical figures) is often difficult, expensive and error-prone (especially for subordinate geopolitical units, events, actors, …)
Use the links established via the CIDOC CRM annotation between places, actors, objects and events as additional reference points!

11 Theory
Geoinformation = Semantic Reference Systems, interpretation, Spatiotemporal Reference Systems
Use thematic information as support for spatiotemporal reference
Mike Goodchild: Geographic Reality
CIDOC CRM + Reasoning + Similarity

12 Theory: Framework
Comparing Place Descriptions
1. Extract new triples out of existing ones → Spatiotemporal & Subsumption Reasoning
2. Compute overlap between source and target triples → Semantic Similarity Measurement
3. Compare remaining labels & identifiers → Syntactic Identifier Matching
4. Estimate how probable it is that the compared places correspond → Identity Assumption

13 Theory: Reasoning
Entities are described by sets of RDF triples
Inference rules to generate new triples
→ Make local knowledge explicit!
→ More comparable information about entities
Example: Spatial & temporal inference rules
Be careful - names are ambiguous! HMS XYZ (1804), HMS XYZ (1805)?
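A spatial inference rule of the kind mentioned on this slide can be sketched as simple forward chaining. The transitivity rule below is an assumption chosen for illustration (the slides do not list the actual rule set), `infer_falls_within` is a hypothetical helper, and the fact that Province Cádiz falls within Andalusia is added here as example input.

```python
def infer_falls_within(triples):
    """Forward-chain one assumed rule until no new triples appear:

    falls_within(a, b) and falls_within(b, c)  =>  falls_within(a, c)
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current containment pairs before adding new ones.
        within = [(s, o) for (s, p, o) in known if p == "falls_within"]
        for a, b in within:
            for b2, c in within:
                if b == b2 and a != c:
                    new = (a, "falls_within", c)
                    if new not in known:
                        known.add(new)
                        changed = True
    return known


facts = {
    ("Cape Trafalgar", "falls_within", "Province Cádiz"),
    ("Province Cádiz", "falls_within", "Andalusia"),  # example input, not from the slides
    ("Battle of Trafalgar", "took_place_at", "Cape Trafalgar"),
}

inferred = infer_falls_within(facts) - facts
print(inferred)
```

The derived triple makes knowledge that was only implicit in the local source available for comparison with a target description that states it directly.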

14 Theory: Similarity
[Figure: network diagram comparing a source and a target description involving Cape Trafalgar and Province Cádiz. Node and edge labels: Napoleonic Wars, Nelson, performed, died in, falls within, overlaps with. The similarity is annotated as sim_p * sim_s.]

15 Theory: Network Approach to Similarity
1. For all tuples from the source entity: find equal or similar tuples within the target entity description
2. Define meaningful notions of similarity for given predicates (relations): spatial, temporal, thematic
3. Define a meaningful notion of similarity for all objects that are not subjects of other triples themselves (e.g. ADL Feature Types)
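The three steps above can be sketched as a best-match comparison between two tuple sets. This is an illustrative Python sketch under stated assumptions, not the measure used by the author: multiplying a relation similarity by an object similarity mirrors the product annotated on slide 14, averaging over source tuples is a choice made here, the 0.5 score between `falls_within` and `overlaps_with` is made up, and all function names are hypothetical.

```python
def predicate_similarity(p, q):
    """Assumed similarity between two relations, in [0, 1]."""
    if p == q:
        return 1.0
    # Made-up score for one pair of spatial relations.
    table = {frozenset(("falls_within", "overlaps_with")): 0.5}
    return table.get(frozenset((p, q)), 0.0)


def object_similarity(a, b):
    """Placeholder: objects match only if their labels are identical."""
    return 1.0 if a == b else 0.0


def description_similarity(source, target):
    """Average, over all source tuples, of the best match among target tuples."""
    if not source:
        return 0.0
    total = 0.0
    for p, o in source:
        best = max(
            (predicate_similarity(p, q) * object_similarity(o, r) for q, r in target),
            default=0.0,
        )
        total += best
    return total / len(source)


# Each description lists (predicate, object) pairs for the place being compared.
source = [("falls_within", "Province Cádiz"), ("witnessed", "Battle of Trafalgar")]
target = [("overlaps_with", "Province Cádiz"), ("witnessed", "Battle of Trafalgar")]

print(description_similarity(source, target))  # 0.75
```

In a fuller version, `object_similarity` would itself compare the descriptions of the two objects recursively, falling back to a type or label comparison once an object has no outgoing triples.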

16 Theory: Neighborhoods & Hierarchies
Different similarity measures for neighborhoods & hierarchies: temporal, spatial, thematic
Egenhofer & Al-Taha 1992
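One way to read the distinction on this slide is that relations are compared by their distance in a neighborhood graph, while types are compared by their distance in a hierarchy. The Python sketch below illustrates both under loud assumptions: the `NEIGHBORS` graph is a simplified, made-up chain and not the neighborhood graph published by Egenhofer & Al-Taha, the `PARENT` hierarchy is invented, and the `1 / (1 + distance)` scoring is one simple choice among many.

```python
from collections import deque

# Hypothetical neighborhood graph between spatial relations (simplified chain).
NEIGHBORS = {
    "disjoint": ["meets"],
    "meets": ["disjoint", "overlaps"],
    "overlaps": ["meets", "falls_within"],
    "falls_within": ["overlaps", "equals"],
    "equals": ["falls_within"],
}

# Hypothetical feature-type hierarchy, stored as child -> parent.
PARENT = {
    "cape": "landform",
    "mountain": "landform",
    "landform": "physical feature",
    "physical feature": "feature",
    "province": "administrative area",
    "administrative area": "feature",
}


def graph_distance(graph, start, goal):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def neighborhood_similarity(a, b, graph=NEIGHBORS):
    """Relations that are fewer steps apart in the graph are more similar."""
    d = graph_distance(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)


def ancestors(node):
    """The node followed by all of its ancestors up to the root."""
    chain = [node]
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def hierarchy_similarity(a, b):
    """Score by the path length through the lowest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            return 1.0 / (1.0 + steps_a + up_b.index(node))
    return 0.0


print(neighborhood_similarity("falls_within", "overlaps"))
print(hierarchy_similarity("cape", "mountain"))
```

The same two functions could be reused for temporal relations or thematic types by swapping in a different graph or hierarchy.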

17 Theory: Syntactic Matching
After recursively applying (semantic) similarity measurements, only labels, vague appellations and identifiers are left
→ Requires syntactic matching / measuring
(Getty Thesaurus) ID: 7008751, ID: 7008750
Cape Trafalgar
Wrexham (found at: www.gwjokes.com)
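Syntactic matching of the remaining labels could, for instance, use a normalized edit distance. The slides do not name a specific string measure, so Levenshtein distance is an assumption here, and `label_similarity` and `identifier_match` are hypothetical helpers. Identifiers are compared for exact equality, since two IDs that differ in a single digit need not denote related entries.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def label_similarity(a, b):
    """Edit distance normalized to [0, 1], ignoring case and outer whitespace."""
    a, b = a.casefold().strip(), b.casefold().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def identifier_match(a, b):
    """Identifiers either match exactly or not at all."""
    return str(a).strip() == str(b).strip()


print(label_similarity("Cape Trafalgar", "Cabo Trafalgar"))
print(identifier_match("7008751", "7008750"))
```

Names in different scripts, such as the Arabic name on slide 6, would need transliteration or a lookup table before a character-based measure like this is meaningful.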

18 Theory: Identity Assumptions
Two place descriptions probably refer to the same (real-world) place if they are linked via equal or similar relations to equal or similar events, actors, objects, …
Similar position within a network of historical facts
Stepwise applying new restrictions to the set of compared historical places
→ Number of compared tuples is a critical issue!
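The stepwise application of restrictions can be sketched as successive filtering of a candidate set. Everything concrete in this Python sketch is an assumption for illustration: the candidate records, their scores and the thresholds are made-up values, and `stepwise_filter` is a hypothetical helper rather than part of the proposed service.

```python
def stepwise_filter(candidates, restrictions):
    """Apply the restrictions one after another, keeping only candidates that pass."""
    remaining = list(candidates)
    for name, passes in restrictions:
        remaining = [c for c in remaining if passes(c)]
        print(f"after {name}: {len(remaining)} candidate(s)")
    return remaining


# Made-up candidate target places with made-up similarity scores.
candidates = [
    {"label": "Cape Trafalgar", "semantic": 0.75, "syntactic": 0.86},
    {"label": "Trafalgar Square", "semantic": 0.10, "syntactic": 0.50},
    {"label": "Cabo Trafalgar", "semantic": 0.70, "syntactic": 0.86},
]

# Illustrative thresholds; suitable values would have to be determined empirically.
restrictions = [
    ("semantic similarity >= 0.5", lambda c: c["semantic"] >= 0.5),
    ("syntactic similarity >= 0.8", lambda c: c["syntactic"] >= 0.8),
]

matches = stepwise_filter(candidates, restrictions)
print([c["label"] for c in matches])
```

Applying cheap restrictions first shrinks the candidate set before the more expensive tuple comparisons run, which speaks to the slide's concern about the number of compared tuples.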

19 Further Work & Evidence
Similarity is only one part of the puzzle!
Other parts: trust, contradictions & consistency, ...
Which inference rules may lead to difficulties?
How to handle complementary knowledge?
Connections to Time Map and ECAI
Evidence! Battle of Trafalgar Scenario?
→ Develop an identity assumption pilot
→ Combination of similarity measurement with itineraries
→ Based on real-world data from ZFMK, Bonn (biodiversity museum)

20 Questions
Thank You!
Special thanks to
Martin Doerr, Foundation for Research and Technology - Hellas (FORTH), Institute of Computer Science. Heraklion, Crete, Greece
Karl-Heinz Lampe, Zoologisches Forschungsmuseum Alexander Koenig (ZFMK). Bonn, Germany
Any Questions?

21 'Real World'-Place?
From: http://de.wikipedia.org/wiki/Bild:Atlantis_map_kircher.gif

22 Gazetteer Feature Types
Andalucía: ADLG, Getty Thesaurus

