Published by Ira Simon. Modified over 9 years ago.
1
DeepDive Introduction
Dongfang Xu, Ph.D. student, School of Information, University of Arizona
Sept 10, 2015
2
Agenda
Brief Introduction
KBC model
Workflow
Reference: Ce Zhang. DeepDive: A Data Management System for Automatic Knowledge Base Construction. Ph.D. Dissertation, University of Wisconsin-Madison, 2015.
3
Brief Introduction
What is DeepDive?
DeepDive is a new type of data management system that enables one to tackle extraction, integration, and prediction problems in a single system. It is built by generalizing from experience in building more than ten high-quality Knowledge Base Construction (KBC) systems, which makes it a flexible framework.
What is KBC?
Knowledge Base Construction (KBC) is the process of populating a knowledge base (KB).
4
Brief Introduction
Why DeepDive? Or why KBC?
Its potential to answer key scientific questions.
---- Collect facts and contribute to scientific discoveries.
A typical knowledge base requires a large amount of resources.
It addresses problems common to many scientific areas.
5
Brief Introduction
Why DeepDive? Or why KBC?
Its potential to answer key scientific questions.
A typical knowledge base requires a large amount of resources.
Its good performance.
---- The developer thinks about features (extraction rules), not algorithms.
---- It handles large amounts of data from a variety of sources.
---- It achieves high quality in extracting complex knowledge and building entity relations.
---- It gives a calibrated probability for each assertion it makes.
---- Domain knowledge + the DeepDive framework = a KBC system for a specific domain.
6
Brief Introduction
General description of DeepDive
---- An application of relational databases: all data in DeepDive is stored in a relational database.
---- The main task is to identify entities and the relations between them.
---- A selection of target facts is typically defined for an information extraction (IE) task.
---- Multiple non-content cues, such as layout information, may be used to assist extraction, e.g. section headers or the layout of tabular data.
---- It extracts all kinds of information about entities and relations, so the data volume is high.
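To make the "everything lives in a relational database" point concrete, here is a minimal sketch. It assumes SQLite as a stand-in database, and the table and column names (`sentences`, `person_mention`, `mention_id`, and so on) are illustrative inventions, not DeepDive's own schema.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the relational store.
# Table and column names are hypothetical, chosen only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sentences (sent_id INTEGER PRIMARY KEY, text TEXT)")
conn.execute(
    "CREATE TABLE person_mention ("
    "mention_id TEXT PRIMARY KEY, sent_id INTEGER, mention_text TEXT)"
)

conn.execute(
    "INSERT INTO sentences VALUES (1, 'Barack Obama and his wife M. Obama attended.')"
)
conn.executemany(
    "INSERT INTO person_mention VALUES (?, ?, ?)",
    [("m1", 1, "Barack Obama"), ("m2", 1, "M. Obama")],
)

# A self-join pairs up person mentions that occur in the same sentence.
pairs = conn.execute(
    "SELECT a.mention_text, b.mention_text "
    "FROM person_mention a JOIN person_mention b "
    "ON a.sent_id = b.sent_id AND a.mention_id < b.mention_id"
).fetchall()
print(pairs)
```

Because mentions and sentences are ordinary rows, later steps can be written as queries over these tables.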
7
Agenda Brief Introduction KBC model Workflow
8
KBC model
Entity: an entity is a real-world person, place, or thing.
---- For example, “Michelle Obama 1” represents the actual entity for a person whose name is “Michelle Obama”.
Relation: a relation associates two (or more) entities.
---- For example, the entities “Barack Obama 1” and “Michelle Obama 1” participate in the HasSpouse relation, which indicates that they are married.
Mention: a mention is a span of text in an input file that refers to an entity or relationship.
---- For example, “Michelle” may be a mention of the entity “Michelle Obama 1”.
Relation mention: a relation mention is a phrase that connects two mentions that participate in a relation.
---- For example, “and his wife” connects the mentions “Barack Obama” and “M. Obama”.
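The four concepts above can be sketched as plain data types. The class and field names here are assumptions made for illustration; they are not taken from DeepDive.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    entity_id: str  # e.g. "Michelle Obama 1"


@dataclass(frozen=True)
class Relation:
    name: str                     # e.g. "HasSpouse"
    entities: Tuple[Entity, ...]  # two (or more) participating entities


@dataclass(frozen=True)
class Mention:
    doc_id: str
    start: int  # character offset where the span begins
    end: int    # character offset just past the span
    text: str


@dataclass(frozen=True)
class RelationMention:
    phrase: str                        # the connecting phrase
    mentions: Tuple[Mention, Mention]  # the two mentions it connects


def mention_of(doc_id: str, sentence: str, text: str) -> Mention:
    """Build a Mention by locating the first occurrence of `text`."""
    start = sentence.index(text)
    return Mention(doc_id, start, start + len(text), text)


sentence = "Barack Obama and his wife M. Obama"
m1 = mention_of("doc1", sentence, "Barack Obama")
m2 = mention_of("doc1", sentence, "M. Obama")
rel_mention = RelationMention("and his wife", (m1, m2))

barack, michelle = Entity("Barack Obama 1"), Entity("Michelle Obama 1")
has_spouse = Relation("HasSpouse", (barack, michelle))
```

Keeping mentions (text spans) separate from entities (real-world objects) is what lets several different spans, such as “Michelle” and “M. Obama”, resolve to the same entity.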
9
KBC model
10
Agenda Brief Introduction KBC model Workflow
11
Work Flow
12
Input file
13
Work Flow Input file User Schema
14
Work Flow Candidate Generation Feature Extraction
15
Work Flow Candidate Generation & Feature Extraction
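As a rough sketch of candidate generation and feature extraction, the example below pairs up person mentions within one sentence and emits simple string features. It assumes the person spans are already known (for instance from a named-entity tagger); the function names and feature templates are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Span = Tuple[int, int]  # (start_token, end_token), end exclusive


def generate_candidates(person_spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Every pair of person mentions in a sentence is a candidate relation."""
    return list(combinations(sorted(person_spans), 2))


def extract_features(tokens: List[str], pair: Tuple[Span, Span]) -> List[str]:
    """Describe a candidate by the words between its two mentions."""
    (_, end1), (start2, _) = pair
    between = tokens[end1:start2]
    features = [f"WORD_BETWEEN_{w.lower()}" for w in between]
    features.append(f"NUM_WORDS_BETWEEN_{len(between)}")
    return features


tokens = "Barack Obama and his wife M. Obama attended the dinner".split()
person_spans = [(0, 2), (5, 7)]  # assumed output of a person tagger

candidates = generate_candidates(person_spans)
features = extract_features(tokens, candidates[0])
print(features)
```

The developer only writes rules like these; deciding how much each feature matters is left to the learning step.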
16
Work Flow Supervision (1) hand-labeling, and (2) distant supervision
17
Work Flow Supervision (1) hand-labeling, and (2) distant supervision
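Distant supervision can be illustrated with a small sketch: instead of labeling each candidate by hand, candidates are labeled by looking them up in an existing knowledge base. The two tiny lookup sets and the function name below are assumptions for illustration, and the sketch assumes mention names have already been resolved to canonical names.

```python
from typing import Optional

# Illustrative stand-ins for an existing knowledge base.
KNOWN_SPOUSES = {frozenset({"Barack Obama", "Michelle Obama"})}
# Pairs known to be in a relation that rules out marriage (parent and child).
KNOWN_PARENT_CHILD = {frozenset({"Barack Obama", "Malia Obama"})}


def distant_label(name1: str, name2: str) -> Optional[bool]:
    """Return True/False when the knowledge base decides, else None."""
    pair = frozenset({name1, name2})
    if pair in KNOWN_SPOUSES:
        return True   # positive training example
    if pair in KNOWN_PARENT_CHILD:
        return False  # negative training example
    return None       # unlabeled: left for inference to decide


labels = [
    distant_label("Michelle Obama", "Barack Obama"),
    distant_label("Barack Obama", "Malia Obama"),
    distant_label("Barack Obama", "Joe Biden"),
]
print(labels)
```

Labels obtained this way are noisy, but they are cheap to produce in large numbers, which is the trade-off against hand-labeling.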
18
Work Flow Learning and Inference In the learning and inference phase, DeepDive generates a factor graph.
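To show what a factor graph computes, here is a toy sketch with two Boolean variables and two weighted factors. It finds marginal probabilities by exact enumeration, which is a swapped-in technique that only works for tiny graphs and is not how inference is done at scale. The variable names, weights, and factor functions are all made up for illustration.

```python
import math
from itertools import product
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, bool]
Factor = Tuple[float, Callable[[Assignment], bool]]  # (weight, indicator)


def marginals(variables: List[str], factors: List[Factor]) -> Dict[str, float]:
    """P(v = True) for each variable, by summing over all assignments."""
    totals = {v: 0.0 for v in variables}
    z = 0.0  # normalizing constant (partition function)
    for values in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, values))
        # Unnormalized probability: exp of the summed weights of active factors.
        score = math.exp(sum(w * float(f(a)) for w, f in factors))
        z += score
        for v in variables:
            if a[v]:
                totals[v] += score
    return {v: totals[v] / z for v in variables}


variables = ["spouse_ab", "spouse_ba"]
factors: List[Factor] = [
    # Evidence from a text feature supports spouse_ab (weight is hypothetical).
    (1.5, lambda a: a["spouse_ab"]),
    # HasSpouse is symmetric: both variables should agree.
    (2.0, lambda a: a["spouse_ab"] == a["spouse_ba"]),
]

probs = marginals(variables, factors)
print(probs)
```

The symmetry factor lets evidence for one variable raise the probability of the other, even though no feature mentions `spouse_ba` directly.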
19
Work Flow Learning and Inference
20
DeepDive Resource
http://deepdive.stanford.edu
https://www.youtube.com/watch?v=SfkLvExfl-s
http://pages.cs.wisc.edu/~czhang/zhang.thesis.pdf
21
Thank you! Q&A