TARTAR Information Extraction Transforming Arbitrary Tables into F-Logic Frames with TARTAR Aleksander Pivk, York Sure, Philipp Cimiano, Matjaz Gams, Vladislav.


1 TARTAR Information Extraction
Transforming Arbitrary Tables into F-Logic Frames with TARTAR
Aleksander Pivk, York Sure, Philipp Cimiano, Matjaz Gams, Vladislav Rajkovic, Rudi Studer
Presented by Stephen Lynn

2 TARTAR Information Extraction
- Free-form text
- Linguistic/NLP approaches
- Tabular structures
- Table comprehension task
- HTML, Excel, PDF, text, etc.
- Semantic interpretation task
- More effort???

3 TARTAR Information Extraction TARTAR Architecture

4 TARTAR Information Extraction Semantic Representation
- Frame Logic (F-Logic)
- Model-theoretic semantics
- Complete resolution-based proof theory
- Expressive power of logic
- Availability of efficient reasoning tools

5 TARTAR Information Extraction F-Logic Frame

6 TARTAR Information Extraction

7 TARTAR Information Extraction Table Comprehension
- Dimensions – a grouping of cells representing similar entities

8 TARTAR Information Extraction Table Comprehension
- Stub – dimension with headers used to index elements in the body

9 TARTAR Information Extraction Table Comprehension
- Box head – column headers (often nested)

10 TARTAR Information Extraction Table Comprehension
- Body – data values

11 TARTAR Information Extraction Table Classes
- 1D, 2D, Complex

12 TARTAR Information Extraction Methodology

13 TARTAR Information Extraction Cleaning & Canonicalization
- Clean DOM tree
- CyberNeko HTML Parser
- Rowspan/colspan expansion
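The rowspan/colspan expansion step can be illustrated with a minimal sketch. It assumes each cell has already been pulled out of the cleaned DOM tree as a `(text, rowspan, colspan)` tuple; that input shape and the name `expand_spans` are illustrative choices, not taken from the TARTAR implementation. Spanning cells are copied into every grid position they cover, so the result is a plain rectangular matrix.

```python
from typing import Dict, List, Optional, Tuple

# (text, rowspan, colspan); this cell representation is an assumption for illustration.
Cell = Tuple[str, int, int]


def expand_spans(rows: List[List[Cell]]) -> List[List[Optional[str]]]:
    """Copy every spanning cell into each grid position it covers."""
    grid: Dict[Tuple[int, int], str] = {}
    n_cols = 0
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            # Skip positions already filled by a rowspan from an earlier row.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
            n_cols = max(n_cols, c)
    n_rows = max((r for r, _ in grid), default=-1) + 1
    # Positions never covered by any cell stay None.
    return [[grid.get((r, c)) for c in range(n_cols)] for r in range(n_rows)]


# Hypothetical table: "Tour" spans two header rows, "Price" spans two columns.
table = [
    [("Tour", 2, 1), ("Price", 1, 2)],
    [("Single", 1, 1), ("Double", 1, 1)],
    [("Alps", 1, 1), ("100", 1, 1), ("150", 1, 1)],
]
expanded = expand_spans(table)
```

After expansion every row has the same number of columns, which is what lets later steps treat the table as a regular matrix.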

14 TARTAR Information Extraction Structure Detection
- Token type hierarchy
- Assign functional types and probabilities
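A token type hierarchy can be pictured as a small tree of types, where two cells are compared by the most specific type they share. The sketch below is only an illustration: the type names, the regular expressions, and the `common_type` helper are assumptions and do not reproduce the hierarchy used in the slides.

```python
import re
from typing import List, Optional

# Child type -> parent type. This tiny hierarchy is hypothetical.
PARENT = {
    "integer": "numeric",
    "decimal": "numeric",
    "numeric": "token",
    "alpha": "token",
    "token": None,
}


def token_type(tok: str) -> str:
    """Assign the most specific type in the hierarchy to a single token."""
    if re.fullmatch(r"\d+", tok):
        return "integer"
    if re.fullmatch(r"\d+[.,]\d+", tok):
        return "decimal"
    if re.fullmatch(r"[A-Za-z]+", tok):
        return "alpha"
    return "token"


def ancestors(t: Optional[str]) -> List[str]:
    """Return the type itself followed by its ancestors up to the root."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def common_type(a: str, b: str) -> str:
    """Most specific type shared by two types (their lowest common ancestor)."""
    anc_b = set(ancestors(b))
    return next(t for t in ancestors(a) if t in anc_b)
```

For example, an integer cell and a decimal cell meet at `numeric`, while an integer cell and an alphabetic cell only meet at the root `token`, which suggests they play different roles in the table.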

15 TARTAR Information Extraction Structure Detection
- Detect logical table orientation
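One way to think about orientation detection is to ask whether cells look more alike when read down the columns or across the rows. The sketch below uses a deliberately crude similarity (shared coarse token type between adjacent cells); the measure, the function names, and the labels `"vertical"` and `"horizontal"` are simplifying assumptions, not the method from the slides.

```python
import re
from typing import List


def coarse_type(cell: str) -> str:
    """Very rough cell typing: numeric, alphabetic, or anything else."""
    s = cell.strip()
    if re.fullmatch(r"[-+]?\d+([.,]\d+)?", s):
        return "numeric"
    if re.fullmatch(r"[A-Za-z ]+", s):
        return "alpha"
    return "other"


def agreement(lines: List[List[str]]) -> float:
    """Fraction of adjacent cell pairs within each line that share a coarse type."""
    same = total = 0
    for line in lines:
        types = [coarse_type(c) for c in line]
        for a, b in zip(types, types[1:]):
            same += a == b
            total += 1
    return same / total if total else 0.0


def orientation(grid: List[List[str]]) -> str:
    """Guess whether similar values run down the columns or across the rows."""
    cols = [list(col) for col in zip(*grid)]
    # Ties are resolved as "vertical"; that default is an arbitrary choice.
    return "vertical" if agreement(cols) >= agreement(grid) else "horizontal"


# Hypothetical table with attributes in the first row.
grid = [
    ["Tour", "Price", "Start"],
    ["Alps", "100", "2004-05-01"],
    ["Coast", "150", "2004-06-15"],
]
transposed = [list(col) for col in zip(*grid)]
```

Here `orientation(grid)` yields `"vertical"` because each column holds values of one kind, and the transposed table yields `"horizontal"`.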

16 TARTAR Information Extraction Structure Detection
- Discover and level regions
- Logical units

17 TARTAR Information Extraction FTM Building
- Functional Table Model (FTM)
- Arrange regions into a tree
- Leaf nodes are data
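The idea of arranging regions into a tree whose leaves are data can be sketched as follows. This is a simplified illustration that assumes each column is described by its path of (possibly nested) headers plus its data values; the `Node` class, `build_tree`, and the sample labels are hypothetical and not the actual FTM data structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def build_tree(header_paths: List[List[str]], column_values: List[List[str]]) -> Node:
    """Arrange nested column headers into a tree whose leaves hold the data values."""
    root = Node("table")
    for path, values in zip(header_paths, column_values):
        node = root
        for label in path:
            # Reuse an existing header node so shared parents are merged.
            match = next((c for c in node.children if c.label == label), None)
            if match is None:
                match = Node(label)
                node.children.append(match)
            node = match
        node.children.extend(Node(v) for v in values)
    return root


def leaf_labels(node: Node) -> List[str]:
    """Collect the labels of all leaf nodes, left to right."""
    if node.is_leaf():
        return [node.label]
    out: List[str] = []
    for child in node.children:
        out.extend(leaf_labels(child))
    return out


# Hypothetical table: "Price" is a nested header over "Single" and "Double".
tree = build_tree(
    [["Tour"], ["Price", "Single"], ["Price", "Double"]],
    [["Alps", "Coast"], ["100", "150"], ["180", "260"]],
)
```

In this sketch the two `Price` columns share one parent node, and every data value ends up as a leaf.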

18 TARTAR Information Extraction Semantic Enriching of FTM
- Labeling
- WordNet and GoogleSets
- Map FTM to a frame
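The final mapping step produces an F-Logic frame, i.e. a concept with typed methods written as `Concept[method => Type]`. The sketch below only shows how such a frame string could be assembled once the labels are known; the function name, the input shape, and the sample concept, method, and type names are all hypothetical.

```python
from typing import List, Tuple

# (method name, parameter names, range type); this input shape is an assumption.
Method = Tuple[str, List[str], str]


def to_flogic_frame(concept: str, methods: List[Method]) -> str:
    """Render a concept and its typed methods as an F-Logic signature frame."""
    parts = []
    for name, params, range_type in methods:
        # Methods with parameters are written as name(Param1, Param2).
        signature = f"{name}({', '.join(params)})" if params else name
        parts.append(f"{signature} => {range_type}")
    return f"{concept}[{'; '.join(parts)}]"


# Hypothetical labels for a tour price table.
frame = to_flogic_frame(
    "Tour",
    [
        ("Code", [], "ALPHANUMERIC"),
        ("Price", ["PersonClass", "RoomClass"], "NUMBER"),
    ],
)
```

Nested headers naturally become method parameters here: a price that depends on both person class and room class is one method with two arguments.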

19 TARTAR Information Extraction Evaluation
- Crawl, extract, and filter web tables
- 135 tables
- 85.4% success rate
- Most problems occurred with complex tables
- Compare automatically generated frames with human-generated frames
- 14 people transformed 3 tables each
- 21 tables in total, each transformed twice (14 × 3 = 42 transformations)
- Syntactic and semantic correctness (strict and soft)

20 TARTAR Information Extraction Results
- Inter-annotator agreement
- System-annotator agreement

21 TARTAR Information Extraction Benefits
- Fully automated knowledge formalization
- Arbitrary tables
- Independent of domain knowledge
- Independent of document type
- Explicit semantics of generated frames
- Query answering over heterogeneous tables

