Learning to Map between Ontologies on the Semantic Web AnHai Doan, Jayant Madhavan, Pedro Domingos, and Alon Halevy Databases and Data Mining group University.

Slides:



Advertisements
Similar presentations
Ontology-Based Computing Kenneth Baclawski Northeastern University and Jarg.
Advertisements

The Semantic Web: What, Why, and How? Ann Wrightson Principal Consultant, alphaXML Ltd
Intelligent Technologies Module: Ontologies and their use in Information Systems Revision lecture Alex Poulovassilis November/December 2009.
The 20th International Conference on Software Engineering and Knowledge Engineering (SEKE2008) Department of Electrical and Computer Engineering
Learning Knowledge Rich User Models from the Semantic Web Gunnar Aastrand Grimnes First Year Talk 14 th May, 2003.
CS570 Artificial Intelligence Semantic Web & Ontology 2
Amit Shvarchenberg and Rafi Sayag. Based on a paper by: Robin Dhamankar, Yoonkyong Lee, AnHai Doan Department of Computer Science University of Illinois,
Prof. Carolina Ruiz Computer Science Department Bioinformatics and Computational Biology Program WPI WELCOME TO BCB4003/CS4803 BCB503/CS583 BIOLOGICAL.
Of 27 lecture 7: owl - introduction. of 27 ece 627, winter ‘132 OWL a glimpse OWL – Web Ontology Language describes classes, properties and relations.
Helping people find content … preparing content to be found Enabling the Semantic Web Joseph Busch.
Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships Eduard C. Dragut Ramon Lawrence Eduard C. Dragut Ramon Lawrence.
1 CSIT600f: Introduction to Semantic Web Conclusion and Outlook Dickson K.W. Chiu PhD, SMIEEE Text: Antoniou & van Harmelen: A Semantic Web PrimerA Semantic.
Machine Learning and the Semantic Web
Research topics Semantic Web - Spring 2007 Computer Engineering Department Sharif University of Technology.
Ontologies and the Semantic Web by Ian Horrocks presented by Thomas Packer 1.
1 CIS607, Fall 2004 Semantic Information Integration Presentation by Julian Catchen Week 3 (Oct. 13)
Xyleme A Dynamic Warehouse for XML Data of the Web.
MS DB Proposal Scott Canaan B. Thomas Golisano College of Computing & Information Sciences.
Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach AnHai Doan Pedro Domingos Alon Halevy.
PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment Natalya Fridman Noy and Mark A. Musen.
1 CIS607, Fall 2005 Semantic Information Integration Presentation by Enrico Viglino Week 3 (Oct. 12)
The Semantic Web Week 1 Module Content + Assessment Lee McCluskey, room 2/07 Department of Computing And Mathematical Sciences Module.
Reconciling Schemas of Disparate Data Sources: A Machine-Learning Approach AnHai Doan Pedro Domingos Alon Halevy.
The Semantic Web Week 12 Term 1 Recap Lee McCluskey, room 2/07 Department of Computing And Mathematical Sciences Module Website:
Learning to Match Ontologies on the Semantic Web AnHai Doan Jayant Madhavan Robin Dhamankar Pedro Domingos Alon Halevy.
Schema Matching Algorithms Phil Bernstein CSE 590sw February 2003.
PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment Natalya F. Noy and Mark A. Musen.
Towards Semantic Web: An Attribute- Driven Algorithm to Identifying an Ontology Associated with a Given Web Page Dan Su Department of Computer Science.
11/8/20051 Ontology Translation on the Semantic Web D. Dou, D. McDermott, P. Qi Computer Science, Yale University Presented by Z. Chen CIS 607 SII, Week.
Samad Paydar Web Technology Laboratory Computer Engineering Department Ferdowsi University of Mashhad 1389/11/20 An Introduction to the Semantic Web.
Automatic Data Ramon Lawrence University of Manitoba
Learning Table Extraction from Examples Ashwin Tengli, Yiming Yang and Nian Li Ma School of Computer Science Carnegie Mellon University Coling 04.
Evaluating Ontology-Mapping Tools: Requirements and Experience Natalya F. Noy Mark A. Musen Stanford Medical Informatics Stanford University.
Pedro Domingos Joint work with AnHai Doan & Alon Levy Department of Computer Science & Engineering University of Washington Data Integration: A “Killer.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
OMAP: An Implemented Framework for Automatically Aligning OWL Ontologies SWAP, December, 2005 Raphaël Troncy, Umberto Straccia ISTI-CNR
WEB FORUM MINING BASED ON USER SATISFACTION PAGE 1 WEB FORUM MINING BASED ON USER SATISFACTION By: Suresh Pokharel Information and Communications Technologies.
Language Identification of Search Engine Queries Hakan Ceylan Yookyung Kim Department of Computer Science Yahoo! Inc. University of North Texas 2821 Mission.
Knowledge Discovery in Ontology Learning A survey.
Mining the Semantic Web: Requirements for Machine Learning Fabio Ciravegna, Sam Chapman Presented by Steve Hookway 10/20/05.
AnHai Doan, Pedro Domingos, Alon Halevy University of Washington Reconciling Schemas of Disparate Data Sources: A Machine Learning Approach The LSD Project.
Web Service Discovery Mechanisms Looking for a Needle in a Haystack? Evangelos Sakkopoulos joint work with J. Garofalakis, Y. Panagis, A. Tsakalidis University.
AnHai Doan Pedro Domingos Alon Levy Department of Computer Science & Engineering University of Washington Learning Source Descriptions for Data Integration.
Learning Source Mappings Zachary G. Ives University of Pennsylvania CIS 650 – Database & Information Systems October 27, 2008 LSD Slides courtesy AnHai.
Text Feature Extraction. Text Classification Text classification has many applications –Spam detection –Automated tagging of streams of news articles,
Knowledge Representation of Statistic Domain For CBR Application Supervisor : Dr. Aslina Saad Dr. Mashitoh Hashim PM Dr. Nor Hasbiah Ubaidullah.
EasyQuerier: A Keyword Interface in Web Database Integration System Xian Li 1, Weiyi Meng 2, Xiaofeng Meng 1 1 WAMDM Lab, RUC & 2 SUNY Binghamton.
Google’s Deep-Web Crawl By Jayant Madhavan, David Ko, Lucja Kot, Vignesh Ganapathy, Alex Rasmussen, and Alon Halevy August 30, 2008 Speaker : Sahana Chiwane.
Towards Distributed Information Retrieval in the Semantic Web: Query Reformulation Using the Framework Wednesday 14 th of June, 2006.
Ontology-Based Computing Kenneth Baclawski Northeastern University and Jarg.
Aligner automatiquement des ontologies avec Tuesday 23 rd of January, 2007 Rapha ë l Troncy.
Ontology Mapping in Pervasive Computing Environment C.Y. Kong, C.L. Wang, F.C.M. Lau The University of Hong Kong.
Trustworthy Semantic Webs Dr. Bhavani Thuraisingham The University of Texas at Dallas Lecture #4 Vision for Semantic Web.
Semantically Federating Multi- Agent Organizations R. Cenk ERDUR, Oğuz DİKENELLİ, İnanç SEYLAN, Önder GÜRCAN. AEGEANT-S Group, Ege University, Dept. of.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
CSC 9010 Spring, Paula Matuszek. 1 CS 9010: Semantic Web Applications and Ontology Engineering Paula Matuszek Spring, 2006.
Semantic web Bootstrapping & Annotation Hassan Sayyadi Semantic web research laboratory Computer department Sharif university of.
Semantic Mappings for Data Mediation
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
Enhanced hypertext categorization using hyperlinks Soumen Chakrabarti (IBM Almaden) Byron Dom (IBM Almaden) Piotr Indyk (Stanford)
Microsoft Research Faculty Summit Prashant Doshi Asst. Professor of Computer Science University of Georgia.
Of 24 lecture 11: ontology – mediation, merging & aligning.
Warren Shen, Xin Li, AnHai Doan Database & AI Groups University of Illinois, Urbana Constraint-Based Entity Matching.
The Semantic Web By: Maulik Parikh.
AnHai Doan, Pedro Domingos, Alon Halevy University of Washington
Building Trustworthy Semantic Webs
Lecture #11: Ontology Engineering Dr. Bhavani Thuraisingham
ece 720 intelligent web: ontology and beyond
Property consolidation for entity browsing
Integrating Taxonomies
Presentation transcript:

Learning to Map between Ontologies on the Semantic Web AnHai Doan, Jayant Madhavan, Pedro Domingos, and Alon Halevy Databases and Data Mining group University of Washington

Semantic Web Mark-up data on the web using ontologies Enable intelligent information processing over the web Personal software agents Queries over multiple web pages …

An Example Semantic Mappings allow information processing across ontologies James Cook PhD, U Sydney Data Instance Find Prof. Cook, a professor in a Seattle college, earlier an assoc. professor at his alma mater in Australia Semantic Mapping People StaffFaculty ProfessorAssoc. Professor Asst. Professor AcademicTechnical ProfessorSenior Lecturer Staff Name Education ……

Semantic Web: State of the Art Languages for ontologies RDF, DAML+OIL,… Ontology learning and Ontology design tools [Maedche’02], Protégé, Ontolingua,… Semantic Mappings crucial to the SW vision [Uscold’01, Berners-Lee, et al.’01] Without semantic mappings…Tower of Babel !!!

Semantic Mapping Challenges Ontologies can be very different Different vocabularies, different design principles Overlap, but not coincide Semantic Mapping information Data instances marked up with ontologies Concept names and taxonomic structure Constraints on the mapping

Overview People StaffFaculty ProfessorAssoc. Professor Asst. Professor AcademicTechnical ProfessorSenior Lecturer Staff Faculty Define Similarity Sim(Fac,Staff) Sim(Fac,Prof) Sim(Fac,Acad) Compute Similarity Satisfy Constraints Academic ?

Our Contributions An automatic solution to taxonomy matching Handles different similarity notions Exploits information in data instances and taxonomic structure, using multi-strategy learning Extend solution to handle wide variety of constraints, using Relaxation Labeling GLUE An implementation, our GLUE system, and experiments on real-world taxonomies High accuracy (68-98%) on large taxonomies ( concepts)

Defining Similarity Multiple Similarity measures in terms of the JPD Assoc. Prof Snr. Lecturer A,  S  A, S  A,  S A,S P(A,  S) + P(A,S) + P(  A,S) P(A,S) = Joint Probability Distribution: P(A,S),P(  A,S),P(A,  S),P(  A,  S) Hypothetical Common Marked up domain P(A  S) P(A  S) Sim(Assoc. Prof., Snr. Lect.) = [Jaccard, 1908]

No common data instances In practice, not easy to find data tagged with both ontologies ! United StatesAustralia Solution: Use Machine Learning A AA S SS

Machine Learning for computing similarities JPD estimated by counting the sizes of the partitions CL S S SSUnited States Australia A AA S SS CL A A AA A,  SA,S  A,  S  A,S A,S  A,S A,  S  A,  S

Improve Predictive Accuracy – Use Multi-Strategy Learning Single Classifier cannot exploit all available information Combine the prediction of multiple classifiers CL A 1 A AA A AA CL A N A AA … Content Learner Frequencies on different words in the text in the data instances Name Learner Words used in the names of concepts in the taxonomy Others … Meta-Learner

So far… Define Similarity Compute Similarity Satisfy Constraints Joint Probability Distribution Multi-strategy Learning

Constraints due to the taxonomy structure Domain specific constraints Department-Chair can only map to a unique concept Numerous constraints of different types Staff People Next Step: Exploit Constraints StaffFac Prof Assoc. ProfAsst. Prof AcadTech ProfSnr. Lect. Lect. People StaffParents Children Extended Relaxation Labeling to ontology matching

Solution: Relaxation Labeling Find the best label assignment given a set of constraints Start with an initial label assignment Iteratively improves labels, given constraints Standard Relaxation Labeling not applicable Extended in many ways Acad Staff People Staff Fac Prof Assoc. ProfAsst. Prof Tech ProfSnr. Lect. Lect.PeopleProf Assoc. Prof Asst. Prof Fac Prof ?Staff Snr. Lect. Lect.StaffAcad Prof Lect. ?

Distribution Estimator Joint Distributions:P(A,B),P(A,  B),… Taxonomy O 2 (structure + data instances) Taxonomy O 1 (structure + data instances) Putting it all together GLUE System Relaxation Labeler Generic & Domain constraints Mappings for O 1, Mappings for O 2 Similarity Estimator Similarity Matrix Similarity function Distribution Estimator Meta Learner Learner CL 1 Learner CL N

Real World Experiments Taxonomies on the web University classes (UW and Cornell) Companies (Yahoo and The Standard) For each taxonomy Extracted data instances – course descriptions, and company profiles Trivial data cleaning 100 – 300 concepts per taxonomy 3-4 depth of taxonomies average data instances per concept Evaluation against manual mappings as the gold standard

Results University IUniversity IICompanies

Related Work Our LSD schema matching system [Doan, Domingos, Halevy ’01] GLUE handles taxonomies, richer models, and a much richer set of constraints Other Ontology and Schema Matching work [Noy, Musen’01], [Melnik, et al.’02], [Ichise, et al.’01] Mostly heuristics, or single machine learning techniques Relaxation Labeling for constraint satisfaction [Hummel, Zucker’83], [Chakrabarti, et al.’00] Significantly extend this approach

Conclusions & Future Work An automated solution to taxonomy matching Handles multiple notions of similarity Exploits data instances and taxonomy structure Incorporates generic and domain-specific constraints Produces high accuracy results Future Work More expressive models Complex Mappings Automated reasoning about mappings between models

An Example Semantic Mappings allow information processing across ontologies James Cook PhD, U Sydney Data Instance Find Prof. Cook, a professor in a Seattle college, earlier an assoc. professor at his alma mater in Australia Semantic Mapping People StaffFaculty ProfessorAssoc. Professor Asst. Professor AcademicTechnical ProfessorSenior Lecturer Staff Name Education ……

Acad Solution: Relaxation Labeling Iterative estimation of most likely label assignment Staff People Staff Fac Prof Assoc. ProfAsst. Prof Tech ProfSnr. Lect. Lect.StaffProf Snr. Lect. Lect. Tech Prof Lect. Prof Lect.Staff Challenges Making the computation tractable – large number of labels Combining effects of various constraintsPeopleProf Assoc. Prof Asst. Prof Acad ? Fac

People StaffFaculty ProfessorAssoc. Professor Asst. Professor AcademicTechnical ProfessorSenior Lecturer Staff Name Education …… Languages for Ontologies E.g. DAML+OIL Ontology Design Tools E.g. Protégé, Ontolingua,… Semantic Mapping