Intelligent Database Systems Lab Presenter : YAN-SHOU SIE Authors Mohamed Ali Hadj Taieb *, Mohamed Ben Aouicha, Abdelmajid Ben Hamadou 2013. KBS Computing.

1 Intelligent Database Systems Lab. Presenter: YAN-SHOU SIE. Authors: Mohamed Ali Hadj Taieb*, Mohamed Ben Aouicha, Abdelmajid Ben Hamadou. 2013. KBS. Computing semantic relatedness using Wikipedia features

2 Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments

3 Motivation: Measuring semantic relatedness is a critical task in many domains such as psychology, biology, linguistics, cognitive science and artificial intelligence.

4 Objectives: We propose a novel system for computing semantic relatedness between words. Recent approaches that exploit Wikipedia as a large semantic resource have shown good performance.

5 Methodology: Our semantic relatedness computing system
– Filtering the Wikipedia category graph
– Pre-processing
» Filtering article content
» Porter stemming
» Weighting article stems
» Providing a Category Semantic Depiction (CSD)

6 Methodology: Filtering Wikipedia category graph
Figure: Different steps performed to generate the Category Semantic Depiction

7 Methodology: Filtering the Wikipedia category graph
– First: clean meta-categories
» We remove all nodes whose labels contain any of the following strings: Wikipedia, wikiproject, lists, mediawiki, template, user, portal, categories, articles, pages, stub and album
– Second: remove orphan nodes, keeping only the category Contents as root
» The maximum depth drops from 291 to 221
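The two filtering steps on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the dictionary representation of the graph, the case-insensitive matching, and the reading of "orphan" as "not reachable from Contents" are all assumptions.

```python
from collections import deque

# Label fragments listed on the slide; case-insensitive matching is an assumption.
META_MARKERS = ("wikipedia", "wikiproject", "lists", "mediawiki", "template",
                "user", "portal", "categories", "articles", "pages", "stub", "album")


def filter_category_graph(children, root="Contents"):
    """Return the subgraph reachable from `root` once meta-categories are dropped.

    `children` (hypothetical format) maps a category label to the labels of
    its subcategories. Categories not reachable from `root` are treated as
    orphans and left out.
    """
    def is_meta(label):
        lowered = label.lower()
        return any(marker in lowered for marker in META_MARKERS)

    kept = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node in kept:
            continue  # already visited via another parent
        kept[node] = [c for c in children.get(node, []) if not is_meta(c)]
        queue.extend(kept[node])
    return kept
```

A breadth-first walk from the root handles both steps in one pass: meta-categories are never expanded, and anything that can only be reached through them (or not at all) never enters the result.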

8 Methodology: Pre-processing
– Filtering article content
» Remove HTML tags, infoboxes, language translations, hyperlinks...
– Porter stemming
» A stop list filters out words that contribute nothing.
– Weighting article stems
– Providing a Category Semantic Depiction (CSD)
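The pre-processing pipeline on this slide might look roughly like the sketch below. Several parts are assumptions rather than the authors' method: the slide does not say how stems are weighted, so relative frequency is used as a placeholder; `toy_stem` is a crude stand-in and not the Porter algorithm; and the stop list is a tiny illustrative sample applied before stemming.

```python
import re
from collections import Counter

# Tiny illustrative stop list (assumption); a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}


def toy_stem(word):
    """Crude suffix stripper standing in for the Porter stemmer (NOT Porter itself)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def weighted_stems(raw_text):
    """Strip tags, drop stop words, stem, and weight stems by relative frequency."""
    text = re.sub(r"<[^>]+>", " ", raw_text)      # remove HTML tags
    tokens = re.findall(r"[a-z]+", text.lower())  # keep alphabetic tokens only
    stems = [toy_stem(t) for t in tokens if t not in STOP_WORDS]
    counts = Counter(stems)
    total = sum(counts.values())
    # Relative frequency is a placeholder weighting, not taken from the slides.
    return {stem: n / total for stem, n in counts.items()} if total else {}
```

In practice an actual Porter stemmer implementation would replace `toy_stem`, and the weighting scheme would be whatever the paper specifies.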

9 Methodology: Semantic relatedness computing system architecture
– Category extraction algorithm
WordNet: resolve the disambiguation pages problem:
» Step 1: extract all outLinks
» Step 2: find the links containing a disambiguation tag in parentheses
» Step 3: extract the categories of the first two links
» Final: take the categories of the article assigned to the first link existing in the ordered set
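One possible reading of the steps above is sketched here. The slide leaves details open, so this is an interpretation: the function name, the `article_categories` lookup table, and the example titles in the usage note are hypothetical, and the role of WordNet in ordering the links is not modelled.

```python
import re

# A parenthetical tag at the end of a title, e.g. "Title (tag)".
PAREN_TAG = re.compile(r"\([^()]+\)\s*$")


def categories_for_ambiguous_word(outlinks, article_categories, max_candidates=2):
    """Pick categories for a word whose page is a disambiguation page.

    `outlinks` is the ordered list of link titles found on the disambiguation
    page (Step 1). `article_categories` (hypothetical) maps an article title
    to its category labels.
    """
    # Step 2: keep only links carrying a disambiguation tag in parentheses.
    tagged = [link for link in outlinks if PAREN_TAG.search(link)]
    # Step 3 and final step: look at the first two tagged links and return the
    # categories of the first one that exists as an article.
    for link in tagged[:max_candidates]:
        if link in article_categories:
            return article_categories[link]
    return []
```

For example, with made-up outlinks `["Jaguar Cars", "Jaguar (animal)", "Jaguar (band)"]` and a lookup table that only contains `"Jaguar (band)"`, the function returns that article's categories.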

10 Methodology: Semantic relatedness computing system architecture
– Semantic relatedness computing

11 Methodology: Evaluating semantic relatedness measures
– Comparison with human judgments
– Pearson product-moment correlation coefficient
– Spearman rank order correlation coefficient
– Datasets
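Both correlation coefficients named above can be computed in a few lines. The sketch below uses the standard definitions (Spearman as Pearson applied to average ranks); the function names are illustrative and no scores from the paper are reproduced.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length score lists.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank-order correlation: Pearson computed on average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

Pearson measures how linearly the system scores track the human judgments, while Spearman only compares the orderings, so a measure that ranks every word pair correctly but on a different scale still reaches a Spearman value of 1.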

12 Experiments: Our semantic relatedness computing system modules using Wikipedia features
– Basic system
– First module
– Second module
– Third module
– Fourth module

13 Experiments: Basic system

14 Experiments: First module: simple patterns

15 Experiments: Second module: Wikipedia pages

16 Experiments: Third module: enrichment using category neighbors in the WCG

17 Experiments: Fourth module: category enrichment using the WCG and redirects

18 Experiments: Application of the SR measure on other datasets
– Datasets RG-65 and MC-30
– The verbal dataset YP-130
Solving word choice problems

19 Conclusions: The resulting system performs well and sometimes outperforms the ESA (Explicit Semantic Analysis) and TSA (Temporal Semantic Analysis) approaches.

20 Comments
Advantages
– Wikipedia provides a large amount of semantic relationship information, which is of great help to the many tasks that rely on measuring semantic relations.
Applications
– cognitive science
– artificial intelligence

