Semi-Automatic Generation of Mini-Ontologies from Canonicalized Relational Tables Chris Hathaway.

Introduction
- Ontologies are an important tool for realizing the vision of the semantic web
- Major setback is their creation and upkeep
  - Must be created by experts
  - Experts are biased in knowledge, agreement needed
  - Ontologies continually change; upkeep a massive task
- Some automation is needed

Introduction (cont’d)
- Current attempts at automatic generation of ontologies have not been successful because they extract from free-form, unstructured text
- A more effective alternative is to extract ontologies from structured data on the web (tables, charts, etc.)
- TANGO project
  - Part 1: Extract tables from the web
  - Part 2: Define mini-ontologies from tables
  - Part 3: Merge into growing domain ontology

Process Overview
- Start out with canonicalized table
- Generate likely candidates for:
  - Object Sets
  - Relationship Sets
  - Functional Constraints
  - Inclusion Constraints/Hierarchical Structure
- Get help from user when needed
- Choose best candidate for the ontology

Thesis Statement
- So far, automatic generation of effective ontologies has been unsuccessful because of the free-form style of Web information
- By extracting concepts, constraints, and hierarchies from individual tables, we can create mini-ontologies that can later be merged into a domain ontology
- Success can only be judged subjectively, by the correctness of the generated ontologies

Example 1: Generate Concepts
- Create list of candidate concepts (usually column names)

Example 1: Generate Concepts
- Determine lexicalization (columns with associated values are lexical)

Example 1: Generate Concepts
- Current ontology

Example 1: Generate Relationships
- Decide relationship sets
  - Exponential number of combinations
  - Basic assumption: one main concept relates to all others (attributes)
- Goal: find central column of interest

Example 1: Generate Relationships
- Look for mapping between one column and title of table
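
One way to picture the title-to-column mapping is a simple word-overlap score. The sketch below is an illustration only, not the method used in the presentation: the function name `find_central_column`, the crude plural stripping, and the sample table are all assumptions made for this example.

```python
def find_central_column(title, columns):
    """Return the column whose name shares the most words with the title.

    Hypothetical helper: words are lowercased and trailing "s" characters
    are stripped as a very rough plural normalization. Returns None when
    no column shares a word with the title.
    """
    def normalize(text):
        return {word.lower().rstrip("s") for word in text.split()}

    title_words = normalize(title)
    best, best_score = None, 0
    for column in columns:
        score = len(title_words & normalize(column))
        if score > best_score:
            best, best_score = column, score
    return best


# Made-up table title and column names, for illustration only.
columns = ["Language", "SubFamily", "Speakers"]
central = find_central_column("Languages of Europe", columns)
print(central)
```

When no column matches, the function returns None, which is the kind of situation where the user would need to be asked for help.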

Example 1: Generate Relationships
- Current ontology

Example 1: Generate Constraints
- FDs and Participation Constraints
- FD definition: X → Y iff (X[i] = X[j]) → (Y[i] = Y[j]) for all row indexes i and j
- Unless the case is solid (two or more rows share the same value), only consider FDs from the central object to attributes
- Use heuristics for setting exact participation (0:1, 1:*, etc.)
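
The FD definition can be checked directly against the rows of a table. The following is a minimal sketch, assuming the table is held as a list of dictionaries; the function name `holds_fd` and the sample rows are made up for this example.

```python
def holds_fd(rows, x, y):
    """Return True if column x functionally determines column y.

    Implements the definition: whenever two rows agree on x, they must
    also agree on y. `rows` is assumed to be a list of dicts.
    """
    seen = {}
    for row in rows:
        key, value = row[x], row[y]
        if key in seen and seen[key] != value:
            return False  # two rows agree on x but differ on y
        seen.setdefault(key, value)
    return True


# Illustrative rows, not taken from the presentation.
rows = [
    {"Country": "Chile", "Continent": "South America", "Capital": "Santiago"},
    {"Country": "Peru", "Continent": "South America", "Capital": "Lima"},
]
print(holds_fd(rows, "Country", "Continent"))
print(holds_fd(rows, "Continent", "Country"))
```

In this sample, Continent repeats with different countries, so Continent → Country is refuted, while Country → Continent is only trivially satisfied because no Country value repeats.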

Example 1: Generate Constraints
- Numerical values are usually functionally determined by the column of interest and have a 0:* participation constraint

Example 1: Generate Constraints
- Completed mini-ontology

Example 2: Generate Concepts
- SubFamily, Group, and SubGroup are generic types
- Enumerate column values as object sets because there are fewer than 5 divisions (recursively)

Example 2: Generate Relationships
- Found mapping of central column of interest to title (Language)
- Exceptions to basic assumption
  - Hierarchy (enumerated object sets)
  - Transitive FDs (X → Y, Y → Z, remove X → Z)
- Create ISA hierarchy from table structure
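
The transitive-FD rule (X → Y, Y → Z, remove X → Z) can be sketched over a set of single-attribute FDs. This is an illustration under stated assumptions: the function name `remove_transitive_fds` is hypothetical, the direction of the sample FDs is assumed rather than taken from the slides, and the sketch assumes the FDs contain no cycles (with mutual FDs such as A → B and B → A it could remove too much).

```python
def remove_transitive_fds(fds):
    """Drop every FD (x, z) that is implied by some (x, y) and (y, z).

    `fds` is an iterable of (determinant, dependent) pairs over single
    attributes. Assumes the FD graph is acyclic.
    """
    fds = set(fds)
    implied = {
        (x, z)
        for (x, y) in fds
        for (y2, z) in fds
        if y == y2 and x != y and y != z and x != z
    }
    return fds - implied


# Hypothetical FDs over the column names from Example 2.
fds = {
    ("Language", "SubGroup"),
    ("SubGroup", "Group"),
    ("Language", "Group"),  # implied by the two FDs above
}
print(sorted(remove_transitive_fds(fds)))
```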

Example 2: Generate Relationships
- Current ontology

Example 2: Generate Hierarchical Constraints
- Assign members to each object set for easy calculation
- Find inclusion dependencies:
  - Union – every member of the parent is a member of one or more children
  - Intersection (less common) – child members are always in both parents
  - Mutual exclusion – the intersection of any two children is empty
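
Once members are assigned to each object set, the three inclusion dependencies reduce to ordinary set checks. The sketch below assumes each object set is a Python set of member values; the function names and the sample members are hypothetical.

```python
from itertools import combinations


def union_holds(parent, children):
    """Union: every parent member belongs to at least one child."""
    return parent <= set().union(*children)


def intersection_holds(child, parents):
    """Intersection: every child member belongs to all parents."""
    return all(child <= parent for parent in parents)


def mutually_exclusive(children):
    """Mutual exclusion: no two children share a member."""
    return all(a.isdisjoint(b) for a, b in combinations(children, 2))


# Made-up members, for illustration only.
parent = {"l1", "l2", "l3", "l4"}
children = [{"l1", "l2"}, {"l3", "l4"}]
print(union_holds(parent, children))
print(mutually_exclusive(children))
print(intersection_holds({"l1"}, [{"l1", "l2"}, {"l1", "l3"}]))
```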

Example 2: Generate Hierarchical Constraints
- Completed mini-ontology

Getting Help from the User
- Sometimes human intervention is required to move on in the generation process
- Effective use of the user’s input will rely on IDS statements:
  - Issue: explains the problem (e.g., no central object was found in the table)
  - Default: describes the default behavior (e.g., a new non-lexical object set named Object will be created)
  - Suggestion: suggests an action for the user to follow (e.g., either choose the column central to describing the table, or name the new object set something appropriate)
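
An IDS statement is naturally a three-field record. The following sketch is one possible representation, not the tool's actual data structure; the class name `IDSStatement` and its `render` method are assumptions, while the example texts are the ones given on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IDSStatement:
    """Hypothetical record for an Issue/Default/Suggestion message."""
    issue: str
    default: str
    suggestion: str

    def render(self):
        """Format the statement as three labeled lines for the user."""
        return (
            f"Issue: {self.issue}\n"
            f"Default: {self.default}\n"
            f"Suggestion: {self.suggestion}"
        )


no_central_object = IDSStatement(
    issue="No central object was found in the table",
    default="A new non-lexical object named Object will be created",
    suggestion=(
        "Either choose the column central to describing the table, "
        "or name the new object set something appropriate"
    ),
)
print(no_central_object.render())
```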

Choosing the Best Ontology
- Even with the given guidelines, a large set of possible mini-ontologies could still remain
- Two options:
  - Ask the user a few of the most limiting questions to reduce the set to a small number
  - Rank the ontologies according to how well they follow the guidelines and how they compare to other tables and the domain ontology
- Pass the smaller set to the merging process

Contributions to Computer Science
- Provides larger resource for ontology-based information extraction
- Quick and effective way of gathering information from the Web
- Semi-automatic tool for generating useful ontologies, useful for the goals of the semantic web