A Robust Approach to Aligning Heterogeneous Lexical Resources
Mohammad Taher Pilehvar, Roberto Navigli
MultiJEDI ERC
Lexical Resource WordNet BabelNet UBY
Why combine resources?
Improved word and concept coverage (e.g., named entities, new senses)
Improved domain coverage
Improved multilinguality (dozens of new languages)
Expert-made relations preserved (e.g., hypernymy, meronymy)
Why combine resources?
Provides complementary knowledge
Applications:
–Semantic parsing (Shi and Mihalcea, 2005)
–Semantic role labeling (Palmer et al., 2010)
–WSD and entity linking (Moro et al., TACL 2014)
Why combine resources?
A case example: a synset from BabelNet for plant (living organism)
Difficulty of resource alignment
Fine granularity of lexical resources
Example: plant has 4 senses in WordNet and 15 senses in the resource it is compared against
How does resource alignment work?
Usually measures the similarity of two concepts (e.g., a WKT sense and the WN sense plant#n#1)
Aligns two concepts if their similarity exceeds a certain threshold
Alignment approaches differ in the way they calculate this similarity
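
The thresholding step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `align` function, the WKT sense identifiers, the toy similarity scores, and the threshold value 0.5 are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def align(
    source_senses: List[str],
    target_senses: List[str],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Return every (source, target) pair whose similarity exceeds the threshold."""
    return [
        (s, t)
        for s in source_senses
        for t in target_senses
        if similarity(s, t) > threshold
    ]


# Toy similarity scores (hypothetical values, for illustration only).
toy_scores: Dict[Tuple[str, str], float] = {
    ("WKT:plant#1", "WN:plant#n#1"): 0.2,
    ("WKT:plant#2", "WN:plant#n#1"): 0.8,
}


def toy_similarity(a: str, b: str) -> float:
    """Look up a precomputed score; unknown pairs count as dissimilar."""
    return toy_scores.get((a, b), 0.0)


pairs = align(["WKT:plant#1", "WKT:plant#2"], ["WN:plant#n#1"], toy_similarity, 0.5)
print(pairs)
```

Any similarity function with the same signature can be plugged in, which is where alignment approaches differ.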
Definitional similarity
Gloss similarity
Definitional similarity
Strong baseline
Falls short when:
–Different wordings are used for the same concept
–Two words lack quality glosses
Example 1:
key -- Metal device shaped in such a way that when it is inserted into the appropriate lock the lock's mechanism can be rotated
key -- An object designed to open and close a lock
Example 2:
plant -- Buildings for carrying on industrial labor.
plant -- The necessary infrastructure used in support and maintenance of a given facility.
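
To see why differently worded glosses defeat a purely definitional measure, here is a small sketch that scores the two key glosses with Jaccard token overlap. Jaccard is an assumption chosen for simplicity; the slides do not say which gloss similarity measure the baseline uses, and the `tokens` and `jaccard` helpers are hypothetical.

```python
import re


def tokens(gloss: str) -> set:
    """Lowercase a gloss and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", gloss.lower()))


def jaccard(gloss_a: str, gloss_b: str) -> float:
    """Jaccard overlap between the token sets of two glosses."""
    a, b = tokens(gloss_a), tokens(gloss_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


key_1 = (
    "Metal device shaped in such a way that when it is inserted into "
    "the appropriate lock the lock's mechanism can be rotated"
)
key_2 = "An object designed to open and close a lock"

score = jaccard(key_1, key_2)
print(f"{score:.3f}")
```

The two glosses describe the same concept but share almost no words, so the overlap score stays low.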
Contributions
A novel concept similarity measure
A robust technique for alignment of heterogeneous resources
An effective ontologization approach
Our approach: SemAlign
Definition similarity
Structural similarity
SemAlign: structural similarity WordNet
SemAlign: core
Modeling concepts through Semantic Signatures
Semantic Signature of a concept
Computed with Personalized PageRank
Distributional representation over all concepts in the semantic network
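
A semantic signature of this kind can be sketched with a plain power-iteration Personalized PageRank. This is an illustrative sketch only: the toy graph and its edges, the seed concept, the damping factor 0.85, and the iteration count are assumptions, not values taken from the slides.

```python
from typing import Dict, List


def personalized_pagerank(
    graph: Dict[str, List[str]],
    seeds: List[str],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power-iteration Personalized PageRank; restarts go only to the seed nodes."""
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: send its mass back to the seeds.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * restart[n]
                continue
            share = damping * rank[node] / len(neighbours)
            for nb in neighbours:
                new_rank[nb] += share
        rank = new_rank
    return rank


# Toy undirected semantic network (hypothetical edges).
graph = {
    "frying_pan#1": ["cooking#1", "kitchen#3"],
    "cooking#1": ["frying_pan#1", "kitchen#3", "oil#4"],
    "kitchen#3": ["frying_pan#1", "cooking#1", "table#3"],
    "oil#4": ["cooking#1"],
    "table#3": ["kitchen#3", "read#3"],
    "read#3": ["table#3", "bank#2"],
    "bank#2": ["read#3"],
}

# The signature is a probability distribution over every concept in the graph.
signature = personalized_pagerank(graph, seeds=["frying_pan#1"])
for concept, weight in sorted(signature.items(), key=lambda kv: -kv[1]):
    print(f"{concept:15s} {weight:.4f}")
```

Concepts close to the seed receive more probability mass than distant ones, which is what makes the vector usable as a representation of the seed concept.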
Semantic Signature
Example WordNet concept for -----
Each dimension gives the importance of one concept (e.g., concept_3) for our concept
cooking#1 frying_pan#1 oil#4 kitchen#3 ...
table#3 carpet#2 sugar#2 natural_gas#2 ...
read#3 bank#2 ...
SemAlign Definition similarity WordNet
SemAlign: signature unification
Find concepts associated with monosemous words
Truncate vectors to the overlapping concepts
The reliability of leveraging monosemous words:
–WordNet synsets containing at least one monosemous word: ~60% (72,000)
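
One way to picture the unification step is sketched below: each signature is re-indexed by the monosemous words of its concepts, and both vectors are then cut down to the shared dimensions. The slides only name the two steps, so the details here are assumptions, and the helper names, concept identifiers, lemma lists, and sense counts are all hypothetical.

```python
from typing import Dict, List, Tuple

Signature = Dict[str, float]


def to_word_space(
    signature: Signature,
    concept_lemmas: Dict[str, List[str]],
    sense_counts: Dict[str, int],
) -> Signature:
    """Re-index a concept-level signature by the monosemous words of its concepts."""
    unified: Signature = {}
    for concept, weight in signature.items():
        for lemma in concept_lemmas.get(concept, []):
            if sense_counts.get(lemma) == 1:  # monosemous in this resource
                unified[lemma] = weight
    return unified


def truncate_to_overlap(sig_a: Signature, sig_b: Signature) -> Tuple[Signature, Signature]:
    """Keep only the dimensions that appear in both signatures."""
    shared = sig_a.keys() & sig_b.keys()
    return {w: sig_a[w] for w in shared}, {w: sig_b[w] for w in shared}


# Resource A (toy data, hypothetical).
sig_a = {"A:1": 0.5, "A:2": 0.3, "A:3": 0.2}
lemmas_a = {"A:1": ["frying_pan", "skillet"], "A:2": ["oil"], "A:3": ["windmill"]}
senses_a = {"frying_pan": 1, "skillet": 1, "oil": 4, "windmill": 1}

# Resource B (toy data, hypothetical).
sig_b = {"B:1": 0.6, "B:2": 0.1, "B:3": 0.3}
lemmas_b = {"B:1": ["frying_pan"], "B:2": ["windmill"], "B:3": ["sailboat"]}
senses_b = {"frying_pan": 1, "windmill": 1, "sailboat": 1}

cut_a, cut_b = truncate_to_overlap(
    to_word_space(sig_a, lemmas_a, senses_a),
    to_word_space(sig_b, lemmas_b, senses_b),
)
print(sorted(cut_a), sorted(cut_b))
```

Because a monosemous word points to exactly one concept in each resource, it gives a dimension that both signatures can share without any prior alignment.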
Semantic Signature Comparison
Weighted Overlap (Pilehvar et al., ACL 2013)
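
The slides do not reproduce the Weighted Overlap formula. The sketch below follows the rank-based definition as recalled from the cited paper: sum 1 / (rank in A + rank in B) over the shared dimensions, normalised by the best achievable value, the sum of 1 / (2i) for i from 1 to the number of shared dimensions. Treat it as an assumption rather than the authors' code; in particular, computing ranks over the shared dimensions only is a choice made here.

```python
from typing import Dict, Iterable


def ranks(signature: Dict[str, float], keys: Iterable[str]) -> Dict[str, int]:
    """Rank the given dimensions by descending weight (1 = heaviest)."""
    ordered = sorted(keys, key=lambda k: (-signature[k], k))
    return {k: i + 1 for i, k in enumerate(ordered)}


def weighted_overlap(sig_a: Dict[str, float], sig_b: Dict[str, float]) -> float:
    """Rank-based similarity of two signatures over their shared dimensions."""
    shared = sorted(sig_a.keys() & sig_b.keys())
    if not shared:
        return 0.0
    rank_a, rank_b = ranks(sig_a, shared), ranks(sig_b, shared)
    numerator = sum(1.0 / (rank_a[k] + rank_b[k]) for k in shared)
    # Best case: both signatures rank every shared dimension identically.
    denominator = sum(1.0 / (2 * i) for i in range(1, len(shared) + 1))
    return numerator / denominator


a = {"cooking#1": 0.5, "kitchen#3": 0.3, "bank#2": 0.2}
same_order = {"cooking#1": 0.6, "kitchen#3": 0.3, "bank#2": 0.1}
reversed_order = {"cooking#1": 0.1, "kitchen#3": 0.3, "bank#2": 0.6}

print(weighted_overlap(a, same_order))
print(weighted_overlap(a, reversed_order))
```

Because only ranks matter, two signatures with different absolute weights but the same ordering are treated as maximally similar, and top-ranked dimensions contribute the most.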
SemAlign: score combination
Ontologization of lexical resources
Definition page for sail
Ontologization of lexical resources
Definition page for windmill
1. A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails.
Sense-annotated words: windmill.n.1 linear motion.n.1 rotational motion.n.1 translate.v.3 wind.n.1 machine.n.1 sail.n.3 adjustable vanes.v.4
Ontologization: similarity-based disambiguation
Definition page for windmill
1. A machine which translates linear motion of wind to rotational motion by means of adjustable vanes called sails.
Definition page for sail
1. A trip in a boat, especially a sailboat.
2. A sailing vessel; a vessel of any kind; a craft.
3. The blade of a windmill.
4. A tower-like structure found on the dorsal (topside) surface of submarines.
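
The disambiguation idea can be sketched as picking the sense of sail whose definition is most similar to the windmill definition. The sketch below swaps in plain content-word overlap as the similarity measure, which is an assumption made to keep the example self-contained; the stopword list, the helper names, and prefixing the context with the headword "windmill" are also hypothetical choices.

```python
import re
from typing import List

# Small hand-picked stopword list (hypothetical, for this example only).
STOPWORDS = {"a", "an", "the", "of", "in", "to", "by", "on", "which", "any"}


def content_words(text: str) -> set:
    """Lowercased alphabetic tokens with stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def overlap(a: str, b: str) -> float:
    """Jaccard overlap between the content words of two texts."""
    wa, wb = content_words(a), content_words(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def disambiguate(context: str, sense_definitions: List[str]) -> int:
    """Return the 1-based index of the sense definition most similar to the context."""
    scores = [overlap(context, d) for d in sense_definitions]
    return scores.index(max(scores)) + 1


# Context: the headword plus its definition from the windmill page.
windmill = (
    "windmill: A machine which translates linear motion of wind to "
    "rotational motion by means of adjustable vanes called sails."
)
sail_senses = [
    "A trip in a boat, especially a sailboat.",
    "A sailing vessel; a vessel of any kind; a craft.",
    "The blade of a windmill.",
    "A tower-like structure found on the dorsal (topside) surface of submarines.",
]

best = disambiguate(windmill, sail_senses)
print(best)  # selects sense 3, "The blade of a windmill."
```

Linking the word sails in the windmill definition to that sense of sail is what turns a plain definition into a semantic relation between two concepts.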
Ontologization: evaluation (Meyer & Gurevych, 2012)
Alignment Experiments: Datasets
(Matuschek & Gurevych, TACL 2013)
WN synsets manually mapped to their corresponding concepts
Alignment Experiments: System configurations
Parameter 1: score combination parameter
Parameter 2: similarity threshold
Setting the two parameters:
a. Unsupervised system
b. Tuning on subset
c. Cross validation
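
How the two parameters might interact is sketched below. The linear interpolation in `combine`, the symbol names `beta` and `threshold`, the grid search in `tune`, and the toy labelled subset are all assumptions made for illustration; the slides name the parameters but do not give the combination formula or the tuning procedure.

```python
from itertools import product
from typing import List, Tuple

# (definition similarity, structural similarity, gold label: aligned?)
Example = Tuple[float, float, bool]


def combine(def_sim: float, str_sim: float, beta: float) -> float:
    """Linear interpolation of the two similarity scores (assumed form)."""
    return beta * def_sim + (1.0 - beta) * str_sim


def f1(examples: List[Example], beta: float, threshold: float) -> float:
    """F1 of the thresholded combined score against the gold labels."""
    tp = fp = fn = 0
    for def_sim, str_sim, gold in examples:
        predicted = combine(def_sim, str_sim, beta) > threshold
        if predicted and gold:
            tp += 1
        elif predicted and not gold:
            fp += 1
        elif gold and not predicted:
            fn += 1
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def tune(examples: List[Example], steps: int = 10) -> Tuple[float, float]:
    """Grid-search beta and threshold in [0, 1] for the best F1 on a labelled subset."""
    grid = [i / steps for i in range(steps + 1)]
    return max(product(grid, grid), key=lambda p: f1(examples, p[0], p[1]))


# Toy labelled subset (hypothetical scores and labels).
subset = [
    (0.9, 0.8, True),
    (0.2, 0.7, True),
    (0.6, 0.1, False),
    (0.1, 0.2, False),
]

best_beta, best_threshold = tune(subset)
print(best_beta, best_threshold, f1(subset, best_beta, best_threshold))
```

In the unsupervised setting the two values would be fixed in advance instead of searched, and in cross validation the search would be repeated on each training fold.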
Comparison system
Dijkstra-WSA (Matuschek & Gurevych, TACL 2013)
Alignment Experiments
[Figure: F1 scores; systems shown include SB and SB+DWSA]
SemAlign: structural similarity
Alignment Experiments
[Figure: F1 scores]
Conclusions
A novel approach for aligning lexical resources
–Accurate even in the absence of training data
–Robust across different resources
An effective ontologization approach
Experiments on aligning
–WN to WP, WT, and OW
Future directions
Integrating the approach into BabelNet for boosting the alignment accuracy
Alignment across different languages
Updating lexicons with novel terms
Thanks!
WordNet:
1. thanks -- an acknowledgment of appreciation
2. thanks -- with the help of or owing to
Compared against:
1. thanks -- an expression of gratitude.
2. thanks -- because of; normally used with a positive connotation, though it can be used sarcastically.