
1 Semantically Enriching Folksonomies with FLOR
Sofia Angeletou, Marta Sabou and Enrico Motta

2 Semantic Web2.0 “The combination of Semantic Web formal structures and Web2.0 user generated content can lead the Web to its full potential”. Semantically Enriching Folksonomies with FLOR

3 Web2.0 …
easy upload
free tagging
requiring minimal annotation effort
open, dynamic and evolving vocabulary
.. leading to a content intensive web
… however ..

4 tagging systems’ characteristics
content retrieval mechanisms are limited:
  keyword based search
  tag cloud navigation
search may suffer from poor precision and recall due to:
  basic level variation problem: whale VS orca
  syntactic inconsistencies: singular VS plural, concatenated/misspelled tags

5 ..an example query: “animal live water”
looking on Flickr for photos of “animals which live in the water”
labels of the results shown: Dog, Bird, Tiger, Cat, Landscape
5/24 ≈ 21% relevant
This is the first page of results from Flickr. These results are sorted by “most interesting” rather than “most recent” (which means they may be more relevant to the query than the most recent ones).

6 .. some missed photos
photo labels: whale, dolphin, dolphin, whale, dolphin, whale, sea elephant, seal, whale

7 modifying the query..
“animal habitat water”
“animal sea”
“animal water”
similar results
...also: not easy for the user to form the most effective query

8 our goal
Improve content retrieval in folksonomies
  enhance precision and recall in search
  enable complex queries
  support intelligent navigation
by applying a semantic layer on top of folksonomy tagspaces
[Figure: an ontology fragment (Animal, Mammal, Marine Mammal, Terrestrial Mammal, Dolphin, Seal, Whale, Sea Elephant, Tiger, Lion, Body of Water, Sea, Ocean; relation hasHabitat) drawn above a tag cloud of folksonomy tags such as kitten, furry, pets, whale, cat, monkey, water, ocean, tiger, seal, otter, mammal, animal, zoo, nature, dolphin]

9 our goal
STEP1: Semantically Enriching Folksonomies
[Figure: the ontology fragment (Animal, Mammal, Marine Mammal, Terrestrial Mammal, Dolphin, Seal, Whale, Sea Elephant, Tiger, Lion, Body of Water, Sea, Ocean; relation hasHabitat) above the tag cloud of folksonomy tags]

10 our goal
STEP2: Querying Folksonomies through the Semantic Layer
[Figure: a Query Mechanism on top of the ontology fragment (Animal, Mammal, Marine Mammal, Terrestrial Mammal, Dolphin, Seal, Whale, Sea Elephant, Tiger, Lion, Body of Water, Sea, Ocean; relation hasHabitat), which in turn sits above the tag cloud of folksonomy tags]

11 “Dolphin OR Seal OR Sea Elephant OR Whale”
21/24 ≈ 87% relevant
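
How a semantic layer could turn the intent "animals that live in water" into the tag query above can be sketched as follows. This is only an illustration: the class hierarchy, the placement of the hasHabitat relation on Marine Mammal, and the function names are assumptions read off the figure labels, not the query mechanism of FLOR itself.

```python
# Toy ontology (assumed structure, based on the labels in the "our goal" figure).
SUBCLASSES = {
    "Animal": ["Mammal"],
    "Mammal": ["Marine Mammal", "Terrestrial Mammal"],
    "Marine Mammal": ["Dolphin", "Seal", "Sea Elephant", "Whale"],
    "Terrestrial Mammal": ["Tiger", "Lion"],
    "Body of Water": ["Sea", "Ocean"],
}
# Assumed: the hasHabitat relation links Marine Mammal to Sea.
HAS_HABITAT = {"Marine Mammal": "Sea"}


def descendants(concept):
    """All transitive subclasses of a concept, in declaration order."""
    result = []
    for child in SUBCLASSES.get(concept, []):
        result.append(child)
        result.extend(descendants(child))
    return result


def leaves(concept):
    """The concept itself or its descendants that have no subclasses."""
    return [c for c in [concept] + descendants(concept) if c not in SUBCLASSES]


def expand_query(root, habitat):
    """Build an OR query from the leaf concepts under `root` whose
    hasHabitat value is `habitat` or one of its subclasses."""
    habitats = {habitat, *descendants(habitat)}
    terms = []
    for concept in [root] + descendants(root):
        if HAS_HABITAT.get(concept) in habitats:
            for leaf in leaves(concept):
                if leaf not in terms:
                    terms.append(leaf)
    return " OR ".join(terms)


print(expand_query("Animal", "Body of Water"))
```

With this toy ontology the call yields the same disjunction as the slide title, because the leaves of Marine Mammal are listed in that order.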

12 existing work on folksonomy enrichment
tag clustering based on co-occurrence frequency, to identify groups of related tags
  works well in certain contexts, but does not bring ‘explicit semantics’ into the system
  co-occurrence has no formal meaning (still not able to address the problem of “animal living in water”)
existing semantic approaches are limited in their semantic coverage
  some use a thesaurus
  others use a pre-defined ontology
  some cases require human intervention
  domain specific

13 our approach
automatic semantic enrichment of tagspaces
exploiting the entire Semantic Web as well as other sources of background knowledge
domain independent
enrichment includes the semantic neighbourhood of a concept found in an ontology (as opposed to only linking the tag with a concept)

14 FLOR
[Figure: FLOR architecture. Input: Tagset. Lexical Processing (Lexical Isolation, Lexical Normalisation; resource: Dictionary) produces the Isolated Tags and a Normalised Tagset. Semantic Expansion (Sense Definition, Semantic Expansion; resource: Thesauri) produces a Sem. Expanded Tagset. Semantic Enrichment (Entity Discovery, Entity Selection, Relation Discovery; resource: Online Ontologies) produces the Output: Sem. Enriched Tagset.]
FLOR is modular and composed of three phases, so each phase can be altered individually (as long as it accepts and produces the predefined input and output, described in the next slide).

15 1.1.Lexical Isolation
isolate tags that can’t be processed by the next steps of FLOR
  special characters: “:P”, “(raw -> jpg)”
  non English: “sillon”, “arbol”
  numbers: “356days”, “tag1”
[Figure: Lexical Processing phase (Tagset, Dictionary, Lexical Isolation, Isolated Tags, Lexical Normalisation, Normalised Tagset)]
As mentioned in the previous slide, FLOR components can be altered or enhanced to deal with more types of tags. The lexical processing phase isolates the tags that cannot be tackled by the next phases in a given FLOR run. For example, the tag “:P” can be found neither in WordNet (WN) nor on the Semantic Web (SW). The same holds for the other tags with special characters, so we isolate them in this phase.
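
A minimal sketch of such an isolation filter is given below. The rule order, the function names and the tiny `ENGLISH_WORDS` set (a stand-in for the Dictionary resource) are assumptions made for illustration; the two-character rule is taken from the categories reported in the preliminary experiments.

```python
import re

# Stand-in for the Dictionary resource (assumption: a real run would use a full English word list).
ENGLISH_WORDS = {"buildings", "building", "corporation", "road", "england"}


def isolation_reason(tag, dictionary=ENGLISH_WORDS):
    """Return why a tag is isolated, or None if it can be passed on."""
    if re.search(r"[^A-Za-z0-9]", tag):
        return "special characters"
    if any(ch.isdigit() for ch in tag):
        return "numbers"
    if len(tag) <= 2:
        return "two characters"
    if tag.lower() not in dictionary:
        return "non English"
    return None


def lexical_isolation(tagset):
    """Split a tagset into the tags kept for later phases and the isolated tags."""
    kept, isolated = [], {}
    for tag in tagset:
        reason = isolation_reason(tag)
        if reason is None:
            kept.append(tag)
        else:
            isolated[tag] = reason
    return kept, isolated


kept, isolated = lexical_isolation(
    ["buildings", "corporation", "road", "england", "bw", "neil101", ":P", "sillon"]
)
print(kept)
print(isolated)
```

Checking special characters before length keeps “:P” in the special-character category rather than the two-character one, which matches how the slides classify it.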

16 1.2.Lexical Normalisation
enhance anchoring
  Folksonomies: santabarbara
  Semantic Web: Santa-Barbara or Santa+Barbara
  WordNet: Santa Barbara
Produce the following: {santaBarbara, santa.barbara, santa_barbara, santa(space)barbara, santa-barbara, santa+barbara, ..}
[Figure: Lexical Processing phase (Tagset, Dictionary, Lexical Isolation, Isolated Tags, Lexical Normalisation, Normalised Tagset)]
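
The variant set above can be generated mechanically once the words inside a tag are known. The sketch below assumes the tag has already been segmented into its words (how FLOR splits “santabarbara” is not described on the slide); `lexical_variants` is a hypothetical helper name.

```python
def lexical_variants(words):
    """Return the lexical representations of a multi-word tag:
    a camel-case form plus one form per delimiter."""
    words = [w.lower() for w in words]
    camel = words[0] + "".join(w.capitalize() for w in words[1:])
    variants = [camel]
    for delimiter in [".", "_", " ", "-", "+"]:
        variants.append(delimiter.join(words))
    return variants


print(lexical_variants(["santa", "barbara"]))
```

Producing all delimiter forms up front means a later lookup can be an exact string match against WordNet entries or ontology local names, instead of a fuzzy comparison.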

17 FLOR methodology
tagset: buildings corporation road england bw neil101
1. Lexical Processing
  buildings : <buildings, building>
  corporation : <corporation>
  road : <road>
  england : <england>
Each step generates the information of the same colour.

18 2. Sense Definition & Semantic Expansion
Goals:
  Define appropriate sense for each tag (based on the context)
  Expand the tag with Synonyms and Hypernyms
[Figure: Semantic Expansion phase (Normalised Tagset, Thesauri, Sense Definition, Semantic Expansion, Sem. Expanded Tagset)]

19 2.1.Sense Definition
Wu & Palmer Conceptual Similarity [1]
[1] Z. Wu and M. Palmer. Verb semantics and lexical selection. In 32nd Annual Meeting of the Association for Computational Linguistics, 1994.
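
For reference, the Wu and Palmer measure is commonly stated in the depth-based form below. This is a standard formulation and not necessarily the exact notation used on the slide; here lcs(c1, c2) denotes the deepest concept that subsumes both c1 and c2, and depth is measured from the root of the hierarchy (depth-counting conventions vary between implementations).

```latex
\mathrm{sim}_{WP}(c_1, c_2) =
  \frac{2 \cdot \mathrm{depth}\bigl(\mathrm{lcs}(c_1, c_2)\bigr)}
       {\mathrm{depth}(c_1) + \mathrm{depth}(c_2)}
```

The value lies between 0 and 1 and grows as the two concepts share a deeper (more specific) common ancestor.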

20 2.1.Sense Definition
tagset: building corporation road england
[Figure: WordNet hierarchy connecting building and road (nodes shown: building, construction, artifact, way, road, object, entity)]
Wu and Palmer Similarity (building, road): 0.666
Using the Wu and Palmer similarity formula on WordNet, calculate the pairwise similarity for all combinations of tags. Building and england don’t connect in WN.

21 2.1.Sense Definition
tagset: building corporation road england
[Figure: WordNet hierarchy connecting building and corporation (nodes shown: group, social group, organization, gathering, enterprise, business, firm, building, corporation)]
Wu and Palmer Similarity (building, corporation): 0.363
sense of building involved: the occupants of a building; “the entire building complained about the noise”
Wu and Palmer similarity is calculated by looking at the path that connects all the possible pairs of senses from the two tags in the hierarchy of WordNet.

22 2.1.Sense Definition
Selected Senses
  building: a structure that has a roof and walls and stands more or less permanently in one place; “there was a three-story building on the corner”
  corporation: a business firm whose articles of incorporation have been approved in some state
  road: an open way (generally public) for travel or transportation
  england: a division of the United Kingdom
For building (and likewise for the rest of the tags) we select the sense that returned the highest similarity with another tag of the tagset. If a tag has no similarity with the other tags in the tagset, its first sense in WordNet (= most popular) is selected.
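
The selection rule can be sketched on a hand-made hierarchy. Everything below is an assumption for illustration: the `PARENT` table is a toy stand-in for WordNet (so the similarity values differ from the 0.666 and 0.363 reported on the slides), the sense identifiers are invented, and root concepts are given depth 1.

```python
# Toy is-a hierarchy (child -> parent); a stand-in for WordNet, not real WordNet data.
PARENT = {
    "object": "entity", "artifact": "object",
    "structure": "artifact", "building#structure": "structure",
    "way": "artifact", "road#way": "way",
    "group": "entity", "social_group": "group",
    "gathering": "social_group", "building#occupants": "gathering",
    "organization": "social_group", "enterprise": "organization",
    "business": "enterprise", "corporation#firm": "business",
    "england#country": "location",  # separate root: no path to the other tags
}
# Candidate senses per tag, in (assumed) WordNet order.
SENSES = {
    "building": ["building#occupants", "building#structure"],
    "corporation": ["corporation#firm"],
    "road": ["road#way"],
    "england": ["england#country"],
}


def path_to_root(node):
    """Nodes from `node` up to its root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def wu_palmer(a, b):
    """2 * depth(lcs) / (depth(a) + depth(b)); 0.0 if the senses do not connect."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):  # walking upwards, so the first hit is the deepest
        if node in ancestors_a:
            depth = lambda n: len(path_to_root(n))
            return 2 * depth(node) / (depth(a) + depth(b))
    return 0.0


def select_senses(tags):
    """Pick, per tag, the sense most similar to any sense of another tag;
    fall back to the first listed sense when nothing connects."""
    chosen = {}
    for tag in tags:
        best_sense, best_sim = SENSES[tag][0], 0.0
        for sense in SENSES[tag]:
            for other in tags:
                if other == tag:
                    continue
                for other_sense in SENSES[other]:
                    sim = wu_palmer(sense, other_sense)
                    if sim > best_sim:
                        best_sense, best_sim = sense, sim
        chosen[tag] = best_sense
    return chosen


print(select_senses(["building", "corporation", "road", "england"]))
```

In this toy setting the "occupants" sense of building is listed first, yet the "structure" sense is chosen because its similarity to road is higher than the similarity of the "occupants" sense to corporation; england keeps its first sense because it connects to nothing.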

23 2.2.Semantic Expansion
The synonyms and hypernyms from the selected senses are used to expand the tags
tag: < <Synonyms>, <Hypernyms> >
  buildings: < <edifice>, <structure, construction, artefact, …> >
  corporation: < <corp>, <firm, business, concern,..> >
  road: < <route>, <way, artefact, object,..> >
  england: < < >, <European_Country, European_Nation, land,..> >

24 FLOR methodology
tagset: buildings corporation road england bw neil101
1. Lexical Processing
  buildings : <buildings, building>
  corporation : <corporation>
  road : <road>
  england : <england>
2. Disambiguation & Semantic Expansion
  buildings: < <buildings, building>, <edifice>, <structure, construction, artefact, …> >
  corporation: < <corporation>, <corp>, <firm, business, concern,..> >
  road: < <road>, <route>, <way, artifact, object,..> >
  england: < <england>, < >, <European_Country, European_Nation, land,..> >
Each step generates the information of the same colour.

25 3.Semantic Enrichment
The final phase links the tags with Ontological Entities (Semantic Web Entities, SWEs):
  Class
  Property
  Individual
[Figure: Semantic Enrichment phase (Sem. Expanded Tagset, Online Ontologies, Entity Discovery, Entity Selection, Relation Discovery, Sem. Enriched Tagset)]

26 3.1.Entity Discovery
Query the Semantic Web with Watson
Identify all entities that contain the tag OR its lexical representations OR its synonyms as localname OR label
For each tagset, for each tag: do what is described in the slides.

27 3.1.Entity Discovery
Watson results:
[Figure: Building entities found in four ontologies (A to D). Ontology A: BuiltStructure, Building, Railway, Pier, Bridge, Tower. Ontology B: HumanShelterConstruction, PublicConstant, FixedStructure, Building, SpaceInAHOC, PartOfAnHSC, OneStoryBuilding, TwoStoryBuilding, ThreeStoryBuilding. Ontologies C and D: Spot, Structure, Building, and a Building with label “Gebäude”. Dashed lines represent disjointness.]
The shadowed results are very similar according to the similarity measure explained in the next slide. The BuiltEntity is something I created (it wasn’t found on WATSON) to demonstrate the next slide.

28 3.2.Entity Selection
the discovered Semantic Web Entities are compared against the Semantically Expanded tags
buildings: < <edifice>, <structure, construction, artefact, …> >
[Figure: Ontology B (HumanShelterConstruction, Building, FixedStructure, PublicConstant, ThreeStoryBuilding, PartOfAnHSC, SpaceInAHOC, OneStoryBuilding, TwoStoryBuilding)]
The parents of the entities are checked against the hypernyms, and if they match (flexibly) the entity is linked to the tag. Entity B is strongly connected to the tag, as there are two parents matching two hypernyms.
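
One way to read "match (flexibly)" is sketched below: each parent name is split into its camel-case words, and a parent matches when one of those words equals a hypernym. That interpretation, the parent lists assigned to each candidate, the `OntologyX` entry and all function names are assumptions for illustration; the slide does not spell out the actual matching rule.

```python
import re


def name_tokens(local_name):
    """Split a camel-case local name into lower-case words,
    e.g. 'FixedStructure' -> {'fixed', 'structure'}."""
    return {t.lower() for t in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", local_name)}


def count_matching_parents(parents, hypernyms):
    """Number of parent names that share a word with the hypernym list."""
    wanted = {h.lower() for h in hypernyms}
    return sum(1 for parent in parents if name_tokens(parent) & wanted)


def select_entities(candidates, hypernyms, min_matches=1):
    """Keep candidate entities with enough matching parents, strongest first."""
    scored = [(name, count_matching_parents(parents, hypernyms))
              for name, parents in candidates.items()]
    selected = [(name, score) for name, score in scored if score >= min_matches]
    return sorted(selected, key=lambda item: (-item[1], item[0]))


# Hypothetical candidates: entity -> names of its parent classes.
candidates = {
    "OntologyA#Building": ["BuiltStructure"],
    "OntologyB#Building": ["FixedStructure", "HumanShelterConstruction"],
    "OntologyX#Building": ["Spot"],
}
print(select_entities(candidates, ["structure", "construction", "artefact"]))
```

Under these assumptions the Ontology B entity ranks first with two matching parents, mirroring the "two parents matching two hypernyms" remark, and the entity whose only parent is Spot is not linked.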

29 FLOR methodology
tagset: buildings corporation road england bw neil101
1. Lexical Processing
  buildings : <buildings, building>
  corporation : <corporation>
  road : <road>
  england : <england>
2. Disambiguation & Semantic Expansion
  buildings: < <buildings, building>, <edifice>, <structure, construction, artefact, …> >
  corporation: < <corporation>, <corp>, <firm, business, concern,..> >
  road: < <road>, <route>, <way, artifact, object,..> >
  england: < <england>, < >, <European_Country, European_Nation, land,..> >
3. Semantic Enrichment
  tag : < <Lexical Representations>, <Synonyms>, <Hypernyms>, <Semantic Web Entities> >
  buildings : < <buildings, building>, <edifice>, <structure, construction, artefact, …>, <URI1#Building, URI2#Building> >
  corporation : < <corporation>, <corp>, <firm, business, concern,..>, <URI1#Corporation, URI2#Corp> >
  road : < <road>, <route>, <way, artefact, object,..>, <URI1#Route> >
  england : < <england>, <>, <Europ. Country, Europ.Nation, land,..>, <URI1#England, URI2#England> >
Each step generates the information of the same colour.

30 preliminary experiments
randomly selected photos tagged with 2819 distinct tags
the Lexical Isolation phase removed 59% of the tags, resulting in 1146 distinct tags and 226 photos
the isolated tags included:
  45 two character tags (e.g., pb, ak)
  333 containing numbers (e.g., 356days, tag1)
  86 containing special characters (e.g., :P, (raw-> jpg))
  818 non English tags (e.g., sillon, arbol)
24 photos contained exclusively isolated tags.

31 tag based results
Manual evaluation.
Tag enrichment = CORRECT if the tag was linked to an appropriate SWE (according to the context of the tag)
Tag enrichment = INCORRECT if the tag was linked to an inappropriate SWE (according to the context of the tag)
Tag enrichment = UNDETERMINED if we were not able to determine the correctness of the enrichment based on the context
Tag NON ENRICHED if the tag was not linked to any entity

32 tag based results
93% enrichment precision
73.4% non enriched tags
we selected a random 10% of the non-enriched tags (85 tags) and were able to manually enrich 29 of them, thus:
  ~70% of the non-enriched tags are due to Knowledge Sparseness in Watson or the Semantic Web
  ~30% of the non-enriched tags are due to FLOR algorithm issues
FLOR failed to enrich 841 tags, i.e., 73.4% of the tags (see Table 1). Because this is a significant number of tags, we wished to understand whether the enrichment failed because of FLOR's recall or because most of the tags have no equivalent coverage in online ontologies.

33 FLOR algorithm issues
24% of non enriched tags were defined incorrectly in Phase 2 (i.e., assigned to the wrong sense)
  e.g., <square> assigned to <geometrical-shape> rather than <geographical-area>
55% of non enriched tags were defined differently in WordNet and in ontologies
  e.g., love
  WordNet: Love → Emotion → Feeling → Psychological feature (a strong positive emotion of regard and affection)
  Semantic Web: Love subClassOf Affection
Although both these definitions refer to the same sense, and additionally the superclass Affection belongs to the gloss of Love in WordNet, they were not matched because Affection does not appear as a hypernym of Love. Current work investigates alternative ways of Semantic Expansion.

34 photo based results
Photo enrichment = CORRECT if all enriched tags CORRECT
Photo enrichment = INCORRECT if all enriched tags INCORRECT
Photo enrichment = MIXED if some tags INCORRECT and some tags CORRECT
Photo enrichment = UNDETERMINED if all enriched tags UNDETERMINED (i.e. could not decide on correctness)
Photo NON ENRICHED if none of the tags was enriched
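
The photo-level rules can be written as a small function over the per-tag outcomes. The label strings and the function name are assumptions; combinations the slide does not cover (for example CORRECT together with UNDETERMINED, with no INCORRECT tag) are returned under a hypothetical "UNSPECIFIED" label rather than guessed.

```python
def photo_enrichment(tag_results):
    """Derive the photo-level outcome from per-tag outcomes.

    tag_results: list of 'CORRECT', 'INCORRECT', 'UNDETERMINED' or 'NON_ENRICHED'.
    """
    enriched = [r for r in tag_results if r != "NON_ENRICHED"]
    if not enriched:
        return "NON_ENRICHED"
    outcomes = set(enriched)
    if outcomes == {"CORRECT"}:
        return "CORRECT"
    if outcomes == {"INCORRECT"}:
        return "INCORRECT"
    if outcomes == {"UNDETERMINED"}:
        return "UNDETERMINED"
    if {"CORRECT", "INCORRECT"} <= outcomes:
        return "MIXED"
    # Not defined on the slide (hypothetical label).
    return "UNSPECIFIED"


print(photo_enrichment(["CORRECT", "NON_ENRICHED", "CORRECT"]))
print(photo_enrichment(["CORRECT", "INCORRECT"]))
```

Non-enriched tags are ignored unless every tag of the photo is non-enriched, which follows the "all enriched tags" wording of the rules.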

35 photo based results

36 future work
Semantic Relatedness measure instead of similarity measure
Process the Lexically Isolated tags using other background knowledge resources, e.g. Wikipedia
Relation discovery between tags with
Step2: Intelligent Query Interface
large scale evaluation
Expand the tags with more hypernyms and synonyms (the love-affection case)

37 conclusions
automatic semantic enrichment of tagspaces is possible
  93% precision in the 24.5% enriched tags
  79% enriched resources
three phase architecture works well
identified the steps of each phase that require improvement

38 Thank you  S.Angeletou@open.ac.uk