DeepDive Introduction Dongfang Xu Ph.D student, School of Information, University of Arizona Sept 10, 2015.


1 DeepDive Introduction
Dongfang Xu, Ph.D. student, School of Information, University of Arizona
Sept 10, 2015

2 Agenda
---- Brief Introduction
---- KBC model
---- Workflow
Reference: Ce Zhang. DeepDive: A Data Management System for Automatic Knowledge Base Construction. Ph.D. Dissertation, University of Wisconsin-Madison, 2015.

3 Brief Introduction
What is DeepDive?
---- DeepDive is a new type of data management system that enables one to tackle extraction, integration, and prediction problems in a single system.
---- It is built by generalizing from experience in building more than ten high-quality Knowledge Base Construction (KBC) systems (a flexible framework).
What is KBC?
---- Knowledge Base Construction (KBC) is the process of populating a knowledge base (KB).

4 Brief Introduction
Why DeepDive? Or why KBC?
---- Its potential to answer key scientific questions: it collects facts and contributes to scientific discoveries.
---- Building a typical knowledge base requires a large amount of resources.
---- These are common problems in scientific fields.

5 Brief Introduction
Why DeepDive? Or why KBC?
---- Its potential to answer key scientific questions.
---- Building a typical knowledge base requires a large amount of resources.
---- Its good performance:
-------- The developer thinks about features (extraction rules), not algorithms.
-------- It handles large amounts of data from a variety of sources.
-------- It achieves high quality in extracting complex knowledge and building entity relations.
-------- It reports a calibrated probability for each assertion it makes.
-------- Domain knowledge + the DeepDive framework = a KBC system for a specific domain.

6 Brief Introduction
General description of DeepDive
---- An application of relational databases: all data in DeepDive is stored in a relational database.
---- The main task is to identify entities and the relations between them.
---- A selection of target facts is typically defined for an information extraction (IE) task.
---- Multiple non-content cues, such as layout information, may be used to assist extraction, e.g. section headers or the layout of tabular data.
---- It extracts all kinds of information about entities and relations, so the data volume is high.

7 Agenda Brief Introduction KBC model Workflow

8 KBC model
Entity: An entity is a real-world person, place, or thing.
---- For example, the entity “Michelle Obama 1” represents the actual person whose name is “Michelle Obama”.
Relation: A relation associates two (or more) entities.
---- For example, the entities “Barack Obama 1” and “Michelle Obama 1” participate in the HasSpouse relation, which indicates that they are married.
Mention: A mention is a span of text in an input file that refers to an entity or relation.
---- For example, “Michelle” may be a mention of the entity “Michelle Obama 1”.
Relation Mention: A relation mention is a phrase that connects two mentions that participate in a relation.
---- For example, “and his wife” connects the mentions “Barack Obama” and “M. Obama”.
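The four concepts on this slide can be pictured as plain records. The following is a minimal Python sketch, not DeepDive's actual schema: the class names, field names, the helper find_mention, and the example sentence are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    entity_id: str  # e.g. "Michelle Obama 1"


@dataclass(frozen=True)
class Relation:
    name: str                     # e.g. "HasSpouse"
    entities: Tuple[Entity, ...]  # two or more participating entities


@dataclass(frozen=True)
class Mention:
    text: str       # the span of text itself
    start: int      # character offset where the span begins
    end: int        # character offset just past the span
    entity: Entity  # the entity this span refers to


@dataclass(frozen=True)
class RelationMention:
    phrase: str         # connecting phrase, e.g. "and his wife"
    left: Mention
    right: Mention
    relation: Relation


def find_mention(sentence: str, text: str, entity: Entity) -> Mention:
    """Hypothetical helper: wrap the first occurrence of `text` as a Mention."""
    start = sentence.index(text)
    return Mention(text, start, start + len(text), entity)


# Made-up example sentence (not from the slides).
sentence = "Barack Obama and his wife M. Obama attended the dinner."
barack = Entity("Barack Obama 1")
michelle = Entity("Michelle Obama 1")
has_spouse = Relation("HasSpouse", (barack, michelle))

m1 = find_mention(sentence, "Barack Obama", barack)
m2 = find_mention(sentence, "M. Obama", michelle)
# The relation mention is the phrase lying between the two mentions.
phrase = sentence[m1.end:m2.start].strip()
rel_mention = RelationMention(phrase, m1, m2, has_spouse)
```

Keeping entities separate from mentions lets several different text spans ("Michelle", "M. Obama") point at the same entity record.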

9 KBC model

10 Agenda Brief Introduction KBC model Workflow

11 Work Flow

12 Input file

13 Work Flow
---- Input file
---- User Schema

14 Work Flow
---- Candidate Generation
---- Feature Extraction

15 Work Flow
---- Candidate Generation & Feature Extraction
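As a rough illustration of candidate generation and feature extraction, the sketch below pairs up person mentions in one sentence and derives simple features from the words between them. It is plain Python under stated assumptions, not DeepDive's extractor interface: the function names, the token-span representation, the feature strings, and the example sentence are all hypothetical.

```python
from itertools import combinations


def generate_candidates(person_spans):
    """Pair every two person mentions in a sentence as a relation candidate.

    person_spans: list of (start, end) token spans, in sentence order.
    """
    return list(combinations(person_spans, 2))


def extract_features(tokens, left, right):
    """Describe a candidate pair by the words lying between the two mentions."""
    between = tokens[left[1]:right[0]]
    return [
        "WORDS_BETWEEN=" + "_".join(between),
        "NUM_WORDS_BETWEEN=%d" % len(between),
    ]


# Made-up example sentence (not from the slides), already tokenized.
tokens = ["Barack", "Obama", "and", "his", "wife", "M.", "Obama",
          "attended", "the", "dinner"]
person_spans = [(0, 2), (5, 7)]  # assumed output of an upstream person tagger

candidates = generate_candidates(person_spans)
features = [extract_features(tokens, left, right) for left, right in candidates]
```

This reflects the point made earlier in the deck: the developer writes feature rules like these, and the system decides how much weight each feature deserves.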

16 Work Flow
---- Supervision: (1) hand-labeling and (2) distant supervision

17 Work Flow
---- Supervision: (1) hand-labeling and (2) distant supervision
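Distant supervision can be sketched as labeling candidates by looking them up in an existing knowledge base instead of labeling them by hand. The Python below is a minimal illustration under assumptions: the function name, the two lookup sets, and the "Alice 1" / "Bob 1" / "Carol 1" entities are hypothetical and do not come from the slides.

```python
def distant_label(pair, known_spouses, known_non_spouses):
    """Return True/False when an existing KB settles the pair, else None."""
    key = frozenset(pair)  # order of the two entities does not matter
    if key in known_spouses:
        return True
    if key in known_non_spouses:
        return False
    return None  # unlabeled: left for learning and inference to decide


# Assumed existing knowledge: one married pair, one pair known not to be married.
known_spouses = {frozenset(("Barack Obama 1", "Michelle Obama 1"))}
known_non_spouses = {frozenset(("Alice 1", "Bob 1"))}  # hypothetical siblings

candidates = [
    ("Michelle Obama 1", "Barack Obama 1"),
    ("Bob 1", "Alice 1"),
    ("Alice 1", "Carol 1"),
]
labels = [distant_label(c, known_spouses, known_non_spouses) for c in candidates]
```

Labels obtained this way are noisy, so they are best read as training evidence rather than ground truth.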

18 Work Flow
---- Learning and Inference
---- In the learning and inference phase, DeepDive generates a factor graph.

19 Work Flow
---- Learning and Inference
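To make the factor graph idea concrete, here is a toy Python sketch that computes marginal probabilities for a handful of Boolean variables by exact enumeration. This is a swapped-in technique for illustration only: it is feasible just for tiny graphs, DeepDive itself relies on sampling-based inference, and the weights here are set by hand rather than learned. The function name, variable names, and weights are assumptions.

```python
import math
from itertools import product


def marginals(variables, factors):
    """Exact marginals P(v = True) for a tiny Boolean factor graph.

    variables: list of variable names.
    factors:   list of (weight, fn); fn maps an assignment dict to True/False,
               and the factor contributes `weight` whenever fn holds.
    """
    totals = {v: 0.0 for v in variables}
    z = 0.0  # normalizing constant (partition function)
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        # Unnormalized probability: exp of the summed weights of satisfied factors.
        score = math.exp(sum(w for w, fn in factors if fn(assignment)))
        z += score
        for v in variables:
            if assignment[v]:
                totals[v] += score
    return {v: totals[v] / z for v in variables}


# Two candidate facts; hand-set weights stand in for learned ones.
variables = ["spouse_c1", "spouse_c2"]
factors = [
    (2.0, lambda a: a["spouse_c1"]),                    # feature evidence for c1
    (-1.0, lambda a: a["spouse_c2"]),                   # feature evidence against c2
    (1.0, lambda a: a["spouse_c1"] == a["spouse_c2"]),  # the two facts tend to agree
]
probabilities = marginals(variables, factors)
```

Each resulting value is the probability attached to one assertion, which is the kind of per-fact probability the deck refers to earlier.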

20 DeepDive Resources
---- http://deepdive.stanford.edu
---- https://www.youtube.com/watch?v=SfkLvExfl-s
---- http://pages.cs.wisc.edu/~czhang/zhang.thesis.pdf

21 Thank you! Q&A

