Baoshi Yan, P2PKM /17/ Grass-Roots Class Alignment Baoshi Yan Information Sciences Institute, University of Southern California
Motivation: Sharing structured data among peers. However, peers might use different terminology (ontologies), so ontology alignment is needed.
What is Alignment: Correspondence between concepts, e.g. PhDStudent = DoctoralStudent, Firstname = Givenname, Lastname = Familyname, major = specialization, …
Alignment: State of the Art. Heuristics-based: Name similarity; Structure similarity; Instance Constraints; Co-occurrence. Domain expert. Centralized. Precise alignment.
Our Approach: Cursory alignment by end users (easy to produce). Combining different users' alignments. Reuse to reduce the effort required of each user. Grass-roots alignment; peer-to-peer alignment; alignment corpus.
Grass-roots Alignment Example: WebScripter tool. When a user puts different items into the same column, they mean the same thing. Inferred alignment: iswc:phone = isi:phonenumber. Inferred alignment: iswc:Person = isi:Div2Member.
Properties of Grass-Roots Alignment: Might be approximate, inconsistent, intransitive. [Figure: classes from ontologies O1 to O4: Graduate, Doctoral Student, PhDStudent, Graduate Student, MSStudent, Master Student]
Challenge: How to reuse approximate or inconsistent grass-roots alignments for alignment purposes. Approximation: conservative semantics of alignment. Inconsistency: evidence.
Observations & Assumptions: Users tend to pick the closest alignment candidate. [Figure: panels (a) to (d) showing classes A, B, C across ontologies O1 and O2]
Basic Idea: Class relationships specified in the ontology are definite. Class relationships indicated by previous alignments are indefinite/ambiguous. Use inference to derive more definite class relationships. Use these class relationships for future alignment.
Class Alignment Algorithm, Step 1: Subclass relationships specified in the ontology
Class Alignment Algorithm, Step 2: Class relationships implied by grass-roots alignments. The Semantics of Grass-roots Alignments: [Figure: configurations of classes A, B, C across O1 and O2, with OR and NOT cases]
The Semantics of Grass-roots Alignments (cont.): [Figure: classes A, B, C across O1 and O2, NOT case]
The Semantics of Grass-roots Alignments (cont.): [Figure: classes A, B, C, D across O1 and O2, with labels A · D and B · C]
Class Alignment Algorithm, Step 2: Class relationships implied by alignments
Class Alignment Algorithm, Step 3: Forward-chaining inference
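To illustrate what forward chaining over class facts can look like, here is a minimal sketch. The slides do not list the inference rules, so the sketch assumes a single rule, transitivity of the superclass relation (A > B and B > C give A > C). The function name and the pair-based fact encoding are hypothetical.

```python
def forward_chain(facts):
    """Close a set of (superclass, subclass) pairs under transitivity.

    Assumption: transitivity is the only rule applied here; the actual
    rule set used in the talk is not given on the slide.
    """
    known = set(facts)
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        for (a, b) in list(known):
            for (c, d) in list(known):
                if b == c and a != d and (a, d) not in known:
                    known.add((a, d))
                    changed = True
    return known


# Toy facts: Student > Graduate and Graduate > DoctoralStudent.
derived = forward_chain({("Student", "Graduate"), ("Graduate", "DoctoralStudent")})
```

With the toy facts above, the derived set also contains Student > DoctoralStudent.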
Dealing with Evidence
If (f1, e1) AND (f2, e2) AND ... AND (fi, ei) => (f, e), then the evidence of the conclusion is e = e1*e2*...*ei.
If the same fact is supported by evidences e1, e2, ..., ei, then e = e1+e2+...+ei.
The same evidence does not count twice: e1 + e1 = e1 and e1 * e1 = e1.
Quantifying evidence: V(e) is a numerical value in (0, 1).
V(e1+e2) = 1-(1-V(e1))*(1-V(e2))
V(e1*e2) = V(e1)*V(e2)
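The evidence rules can be tried out with a small sketch. It assumes a sum-of-products encoding that is not from the slides: an evidence expression is a set of terms, and each term is a set of atomic evidence names. Using sets makes e1 + e1 = e1 and e1 * e1 = e1 hold automatically. The names `atom`, `plus`, `times`, and `value` are hypothetical, and `value` treats the terms of a sum as independent when applying the V(e1+e2) formula.

```python
def atom(name):
    """A single piece of evidence, as a one-term sum of products."""
    return frozenset({frozenset({name})})


def times(e1, e2):
    """e1 * e2: evidence of a conclusion derived from both premises."""
    return frozenset(t1 | t2 for t1 in e1 for t2 in e2)


def plus(e1, e2):
    """e1 + e2: the same fact supported by either evidence."""
    return e1 | e2


def value(e, atom_values):
    """V(e), with V(e1*e2) = V(e1)*V(e2) and V(e1+e2) = 1-(1-V(e1))*(1-V(e2))."""
    miss = 1.0
    for term in e:
        term_value = 1.0
        for a in term:
            term_value *= atom_values[a]
        miss *= 1.0 - term_value
    return 1.0 - miss


e1, e2 = atom("a1"), atom("a2")
scores = {"a1": 0.5, "a2": 0.5}  # hypothetical values in (0, 1)
```

With these scores, `value(times(e1, e2), scores)` is 0.25 and `value(plus(e1, e2), scores)` is 0.75.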
Class Alignment Algorithm, Step 4: Class Alignment Using Facts KB
Sup(A): the set of superclasses of A.
Sub(A): the set of subclasses of A.
Ind(A): all B such that (A > B OR B > A) is in the KB but neither A > B nor B > A is, i.e., B and A are indistinguishable according to the facts KB.
Dealing with inconsistencies: for each B in Sup(A), if there is a better-supported fact A > B, NOT(B > A) or B A, remove B from Sup(A). Do the same for Sub(A).
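The three sets can be sketched over a toy facts KB. The encoding is an assumption made for illustration: `definite` maps a (superclass, subclass) pair to a support value, and `ambiguous` holds unordered pairs for facts of the form (A > B OR B > A). For inconsistencies, the sketch handles only one of the conflict types named above, a better-supported reverse fact A > B.

```python
def sup(a, definite):
    """Superclasses of a, dropping b when the reverse fact a > b is better supported."""
    return {hi for (hi, lo), v in definite.items()
            if lo == a and v >= definite.get((a, hi), 0.0)}


def sub(a, definite):
    """Subclasses of a, dropping b when the reverse fact b > a is better supported."""
    return {lo for (hi, lo), v in definite.items()
            if hi == a and v >= definite.get((lo, a), 0.0)}


def ind(a, definite, ambiguous):
    """Classes b with an ambiguous fact about {a, b} and no definite fact either way."""
    result = set()
    for pair in ambiguous:
        if a in pair and len(pair) == 2:
            (b,) = pair - {a}
            if (a, b) not in definite and (b, a) not in definite:
                result.add(b)
    return result


# Hypothetical KB: one inconsistent pair of facts and one ambiguous fact.
definite = {
    ("Graduate", "MasterStudent"): 0.9,
    ("MasterStudent", "Graduate"): 0.2,  # conflicts with the fact above, weaker
    ("Student", "MasterStudent"): 0.8,
}
ambiguous = {frozenset({"MasterStudent", "MSStudent"})}
```

Here `sup("MasterStudent", definite)` keeps Graduate and Student, while `sup("Graduate", definite)` is empty because the weaker conflicting fact is filtered out.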
Class Alignment Using Facts KB (cont.): Examples: Ind(MasterStudent) = {MSStudent}; Sup(MasterStudent) = {Graduate, Student, UnivStudent}; Sub(Graduate) = {MasterStudent, MSStudent, DoctoralStudent}
Class Alignment Using Facts KB (cont.)
Given A from O1, find the best alignment B in O2 in the following order:
1. O2 ∩ Ind(A)
2. O2 ∩ Sup(A): if B, B1 ∈ O2 ∩ Sup(A), pick B if B1 > B
3. O2 ∩ Sub(A): if B, B1 ∈ O2 ∩ Sub(A), pick B if B > B1
All else being equal, pick the better-supported candidate.
Otherwise there is no alignment candidate for A in O2.
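The selection order can be sketched as follows. The function name, the dictionary-based stand-ins for Ind, Sup, and Sub, the `above` set of (superclass, subclass) pairs, and the `support` scores are all hypothetical. The toy data reuses class names from the slides, but which classes belong to O2 is assumed for illustration.

```python
def pick_alignment(a, o2, ind, sup, sub, above, support):
    """Return the preferred O2 class for a, or None if there is no candidate."""
    def best(cands):
        # All else being equal, prefer the better-supported candidate.
        return max(sorted(cands), key=lambda b: support.get(b, 0.0))

    cands = o2 & ind.get(a, set())
    if cands:
        return best(cands)

    cands = o2 & sup.get(a, set())
    if cands:
        # Keep the closest superclasses: drop B when B > B1 for another candidate B1.
        lowest = {b for b in cands if not any((b, b1) in above for b1 in cands)}
        return best(lowest or cands)

    cands = o2 & sub.get(a, set())
    if cands:
        # Keep the closest subclasses: drop B when B1 > B for another candidate B1.
        highest = {b for b in cands if not any((b1, b) in above for b1 in cands)}
        return best(highest or cands)

    return None


# Hypothetical toy data; O2 membership is an assumption.
o2 = {"UnivStudent", "Graduate", "MSStudent"}
ind = {"MasterStudent": {"MSStudent"}, "Student": {"UnivStudent"}}
sup = {"DoctoralStudent": {"Graduate", "Student", "UnivStudent"}}
above = {("UnivStudent", "Graduate")}

choice = pick_alignment("DoctoralStudent", o2, ind, sup, {}, above, {})
```

With this data, DoctoralStudent has no indistinguishable class in O2, so the choice falls to its superclasses, and Graduate is preferred over UnivStudent because UnivStudent > Graduate.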
Class Alignment Using Facts KB (cont.): Example: Ind(MasterStudent) = {MSStudent}; Sup(DoctoralStudent) = {Graduate, Student, UnivStudent}; Ind(Student) = {UnivStudent}. [Figure: ontologies O1 and O2 with classes Student, Doctoral Student, Master Student, UnivStudent, Graduate, MSStudent]
Evaluation (qualitative analysis): In the ideal case, each previous alignment is the best possible; then correctness is guaranteed in some cases. In the not-so-ideal case, bad facts are likely filtered out. Example: Sup(DoctoralStudent) = {UnivStudent, Graduate}. [Figure: ontologies O1 and O2 with classes Student, Doctoral Student, UnivStudent, Graduate]
Evaluation: 26 ontologies in the university student domain. Measure the resulting facts KB against a reference KB.
Related Work. Areas: schema mediation, schema reconciliation, schema matching, semantic coordination, semantic mapping, and ontology mapping. Systems: ONION, PROMPT, LSD, GLUE, Automatch, SemInt, CUPID, COMA, MGS-DCM, HSDM Mediator, MOBS… Techniques: name similarity, structure similarity, domain constraints, instance features, instance similarity, multi-strategy learning, statistical analysis, alignment reuse. Little work on peer-to-peer alignment.
Summary: An alignment approach in which ontology alignment is carried out by end users in a peer-to-peer fashion. Peers are both alignment consumers and producers. Future work: detailed experiments and theoretical analysis; property alignment with class as context. Thank you!