Construction of Enterprise Knowledge Graphs


1 Construction of Enterprise Knowledge Graphs
Chapter 4

3 Outline
Knowledge graph lifecycle
Ontology authoring
Semi-automated linking of enterprise data for virtual knowledge graphs
Focus on building knowledge graphs with human involvement

5 A general lifecycle

6 (1) Specification
Draw up a detailed specification
Main tasks:
1) Identification and analysis of data sources
2) URI design
Select the data to integrate and publish: data that exists in the organization, plus any external data that is needed

7 URI design
Put as much information into the URI as possible
Use slash URIs instead of hash URIs whenever possible
Separate the TBox (ontology model) from the ABox (instances)
TBox: append "ontology" to the base URI (ontology/Person)
ABox: append "resource" to the base URI (resource/Erna)
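The TBox/ABox naming convention above can be sketched in a few lines of Python. The base URI `http://example.org/` and the helper names `tbox_uri` and `abox_uri` are assumptions made for illustration; only the `ontology/` and `resource/` path segments come from the slide.

```python
from urllib.parse import quote

# Hypothetical base URI; replace with the organization's own namespace.
BASE_URI = "http://example.org/"


def tbox_uri(term, base=BASE_URI):
    """Build a slash URI for an ontology term (TBox), e.g. a class or property."""
    return f"{base.rstrip('/')}/ontology/{quote(term)}"


def abox_uri(name, base=BASE_URI):
    """Build a slash URI for an instance (ABox)."""
    return f"{base.rstrip('/')}/resource/{quote(name)}"


print(tbox_uri("Person"))
print(abox_uri("Erna"))
```

Keeping the two prefixes apart makes it possible to tell from the URI alone whether a term belongs to the model or to the data.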

8 (2) Modelling
Determine the ontology to be used for modelling the domain
Reuse as much as possible
If no suitable ontology is found, reuse parts of existing ones
If nothing works out, start from scratch (follow the NeOn methodology)

9 A general lifecycle

10 (3) Data lifting
Transfer existing data to RDF
Two main activities: transformation and linking
GRDDL, RDB2RDF

11 Transformation
Requirements:
Full conversion: queries that are possible on the original data source must also be possible on the RDF version
RDF instances should reflect the target ontology structure as closely as possible
Tools: RDB2RDF, GRDDL, Google Refine/OpenRefine (RDF extension), D2R Server, ODEMapster, Stats2RDF
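As a rough illustration of the transformation step, the sketch below turns one relational row into N-Triples lines whose shape follows a target ontology. It is not how any of the listed tools work internally. The base URI, the table columns, and the column-to-property mapping are all hypothetical, and literals are emitted as plain strings without datatypes.

```python
# Hypothetical base URI, reusing the resource/ontology split from the URI design slide.
BASE = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def row_to_triples(row, key, class_name, column_to_property):
    """Convert one table row (a dict) into a list of N-Triples lines."""
    subject = f"<{BASE}resource/{row[key]}>"
    triples = [f"{subject} <{RDF_TYPE}> <{BASE}ontology/{class_name}> ."]
    for column, prop in column_to_property.items():
        value = row.get(column)
        if value is None:
            continue  # NULL cells produce no triple
        # Escape backslashes and double quotes for the N-Triples literal.
        literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{BASE}ontology/{prop}> "{literal}" .')
    return triples


row = {"id": "Erna", "full_name": "Erna", "city": None}
mapping = {"full_name": "name", "city": "city"}
for line in row_to_triples(row, "id", "Person", mapping):
    print(line)
```

Every non-null cell becomes a triple, which is the minimum needed for queries on the original table to remain answerable on the RDF version.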

12 Linking
Create links between our knowledge graph and external graphs
Steps:
1. Identify KGs that are suitable as linking targets (manual)
2. Discover relationships between items in our KG and the external KGs (tools exist)
3. Validate the relationships (performed by domain experts)
Collections of KGs can be found in "Linked Data repositories" such as CKAN; finding suitable ones is a manual task
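Steps 2 and 3 can be sketched as follows. This is a minimal example that assumes label similarity is a good enough signal for candidate links; real link-discovery tools use richer comparisons. The URIs, labels, and the 0.9 threshold are hypothetical. Each candidate carries a `validated` flag that stays `False` until a domain expert confirms it.

```python
from difflib import SequenceMatcher


def normalize(label):
    """Lower-case a label and collapse whitespace before comparison."""
    return " ".join(label.lower().split())


def candidate_links(local, external, threshold=0.9):
    """Propose owl:sameAs candidates between two {uri: label} dictionaries."""
    links = []
    for local_uri, local_label in local.items():
        for ext_uri, ext_label in external.items():
            score = SequenceMatcher(
                None, normalize(local_label), normalize(ext_label)
            ).ratio()
            if score >= threshold:
                links.append({
                    "subject": local_uri,
                    "predicate": "owl:sameAs",
                    "object": ext_uri,
                    "score": round(score, 2),
                    "validated": False,  # set by a domain expert in step 3
                })
    return links


local = {
    "http://example.org/resource/OUS": "Oslo University Hospital",
    "http://example.org/resource/School": "Bergen School",
}
external = {
    "http://example.com/id/1": "Oslo university  hospital",
    "http://example.com/id/2": "Trondheim Library",
}
for link in candidate_links(local, external):
    print(link)
```

The nested loop compares every local item with every external item, which is fine for a sketch but would need blocking or indexing on real data.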

13 (4) Data publication
Activities:
Knowledge graph publication
Metadata publication

14 Knowledge graph publication
Store and publish RDF data
RDF stores: Virtuoso Universal Server, Jena, Sesame, 4store, YARS
Some already include SPARQL endpoints

15 Metadata publication
Include metadata about the KG:
Data about structure
Data about access
Description of links between knowledge graphs
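One common way to publish such metadata is the VoID vocabulary. The slide does not name it, so its use here is an assumption. The sketch below produces a small Turtle description covering structure (triple count), access (SPARQL endpoint), and links (one linkset per target dataset). All URIs are hypothetical.

```python
def void_description(dataset_uri, sparql_endpoint, triple_count, link_targets):
    """Return a Turtle string describing a dataset with VoID terms."""
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f"    void:triples {int(triple_count)} ;",           # structure
        f"    void:sparqlEndpoint <{sparql_endpoint}> .",    # access
    ]
    for number, target in enumerate(link_targets, start=1):
        lines += [
            "",
            f"<{dataset_uri}/linkset/{number}> a void:Linkset ;",  # links
            "    void:linkPredicate owl:sameAs ;",
            f"    void:subjectsTarget <{dataset_uri}> ;",
            f"    void:objectsTarget <{target}> .",
        ]
    return "\n".join(lines) + "\n"


print(void_description(
    "http://example.org/dataset",
    "http://example.org/sparql",
    1200,
    ["http://example.com/other-dataset"],
))
```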

16 A general lifecycle

17 Data curation
Aims at maintaining and preserving data for reuse over time
Cleaning noise
Identify errors (40x/50x errors)
Broken links
Malformed data types ("true" as xsd:int)
Preservation
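Two of the checks named above are easy to sketch: flagging a literal whose lexical form does not fit its declared datatype, and treating 4xx/5xx HTTP status codes as broken links. The helper names are hypothetical, only `xsd:int` and `xsd:boolean` are covered, and no network request is made; the status code is assumed to come from a separate link checker.

```python
import re

# Lexical forms for the two datatypes covered by this sketch.
LEXICAL_FORMS = {
    "xsd:int": re.compile(r"[+-]?[0-9]+"),
    "xsd:boolean": re.compile(r"true|false|1|0"),
}


def is_valid_literal(lexical, datatype):
    """Check whether a lexical form is legal for the declared datatype."""
    pattern = LEXICAL_FORMS.get(datatype)
    if pattern is None:
        raise ValueError(f"datatype not covered by this sketch: {datatype}")
    if not pattern.fullmatch(lexical):
        return False
    if datatype == "xsd:int":
        # xsd:int is a 32-bit signed integer.
        return -2**31 <= int(lexical) <= 2**31 - 1
    return True


def is_broken_link(status_code):
    """Treat client (4xx) and server (5xx) errors as broken links."""
    return 400 <= status_code < 600


print(is_valid_literal("true", "xsd:int"))
print(is_valid_literal("true", "xsd:boolean"))
print(is_broken_link(404))
```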

18 Outline
Knowledge graph lifecycle
Ontology authoring
Semi-automated linking of enterprise data for virtual knowledge graphs
Focus on building knowledge graphs with human involvement

19 Ontology authoring: a competency question-driven approach
Real-world ontologies require manual construction
Requires deep and complex professional knowledge
Ontology authors are domain experts, not KG experts
Ontology authoring is time-consuming and error-prone
Solution: "competency question-driven ontology authoring" (CQOA)

20 Competency questions
The ontology must be able to answer competency questions (CQs)
Natural language sentences
Semiformal pattern: "Which [CE1] [OPE] [CE2]?"
Examples:
"Which mammals eat grass?" (animal ontology)
"Which processes implement an algorithm?" (software engineering ontology)
CQs are especially helpful to ontology authors
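To make the pattern concrete, here is a minimal sketch that splits a CQ of this shape into its three slots, reading [CE1] and [CE2] as class expressions and [OPE] as an object property expression. It assumes each slot is a single word, optionally preceded by an article in the case of [CE2]; real CQs need proper linguistic parsing. The function name `parse_cq` is hypothetical.

```python
import re

# "Which [CE1] [OPE] [CE2]?" with one word per slot and an optional article.
CQ_PATTERN = re.compile(
    r"Which\s+(\w+)\s+(\w+)\s+(?:(?:an?|the)\s+)?(\w+)\s*\?",
    re.IGNORECASE,
)


def parse_cq(question):
    """Return (CE1, OPE, CE2) for a matching question, otherwise None."""
    match = CQ_PATTERN.fullmatch(question.strip())
    return match.groups() if match else None


print(parse_cq("Which mammals eat grass?"))
print(parse_cq("Which processes implement an algorithm?"))
```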

21 Presuppositions
"A special condition that must be met for a linguistic expression to have a denotation"
Example: "Which processes implement an algorithm?"
The ontology must satisfy the following presuppositions:
The classes "Process" and "Algorithm" and the property "implements" occur in the ontology
The ontology allows a "Process" to implement an "Algorithm"
The ontology allows a "Process" to not implement an "Algorithm"
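The first presupposition, that the terms occur in the ontology, can be checked with plain set lookups, as sketched below. The ontology is modelled as two Python sets, which is a deliberate simplification. The other two presuppositions ask what the ontology allows and would need a reasoner, so they are not covered here. The function name is hypothetical.

```python
def missing_terms(classes, properties, ce1, ope, ce2):
    """Return the terms a CQ presupposes but the ontology does not contain."""
    missing = [term for term in (ce1, ce2) if term not in classes]
    if ope not in properties:
        missing.append(ope)
    return missing


# Toy ontology vocabulary for the software engineering example.
classes = {"Process", "Algorithm"}
properties = {"implements"}

print(missing_terms(classes, properties, "Process", "implements", "Algorithm"))
print(missing_terms({"Process"}, set(), "Process", "implements", "Algorithm"))
```

An empty list means the occurrence presupposition holds; any returned term tells the author what still has to be added before the CQ can be answered.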

22 Formulation of competency questions
Selection question: "Which mammals eat grass?"
Binary question: should be answered with a boolean value (yes/no)
Counting question: should be answered with a number, e.g. "How many pizzas have ham or chicken as topping?"
Question polarity: "Which pizza has no vegetables?"
Predicate arity: "Is it thin or thick bread?"
Modifier: "If I have 3 ingredients, how many pizzas can I make?"
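A rough way to see how question type and polarity can be told apart is a keyword heuristic on the opening words. The sketch below is only that: the word lists are assumptions, it ignores predicate arity and modifiers, and a question that opens with a clause (such as the modifier example) falls through to "unknown".

```python
def classify_cq(question):
    """Return (question_type, polarity) using simple keyword rules."""
    words = question.strip().rstrip("?").lower().split()
    if not words:
        raise ValueError("empty question")

    if words[:2] == ["how", "many"]:
        question_type = "counting"
    elif words[0] in {"which", "what", "who", "where"}:
        question_type = "selection"
    elif words[0] in {"is", "are", "does", "do", "can", "has", "have"}:
        question_type = "binary"
    else:
        question_type = "unknown"

    negations = {"no", "not", "without"}
    polarity = "negative" if negations & set(words) else "positive"
    return question_type, polarity


for cq in (
    "Which mammals eat grass?",
    "How many pizzas have ham or chicken as topping?",
    "Which pizza has no vegetables?",
    "Does the pizza contain ham?",  # hypothetical binary example
):
    print(cq, classify_cq(cq))
```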

23 Test suite of CQs Table 4.1 (p. 99)

24 Outline
Knowledge graph lifecycle
Ontology authoring
Semi-automated linking of enterprise data for virtual knowledge graphs
Focus on building knowledge graphs with human involvement

25 Semi-automated linking of enterprise data for knowledge graphs
The activity is part of the "Data lifting" step in the lifecycle
Create data linkage
Helix: linking information sources
Build a knowledge graph for data discovery

26 Techniques of data discovery
Normalize data in different formats
Index structured data in tables
Perform semantic matching between schema elements of structured data
Tag data with semantic tags
Find linkage points in the data so that users can join between tables
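Semantic tagging of table columns can be approximated with value patterns, as in the sketch below. The tag names, the regular expressions, and the 80% threshold are assumptions chosen for illustration; they are not taken from Helix.

```python
import re

# Hypothetical semantic tags and the value pattern each one expects.
TAG_PATTERNS = {
    "zip_code": re.compile(r"[0-9]{5}"),
    "iso_date": re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def tag_column(values, threshold=0.8):
    """Return the tags whose pattern matches at least `threshold` of the values."""
    cleaned = [str(value).strip() for value in values if str(value).strip()]
    if not cleaned:
        return []
    tags = []
    for tag, pattern in TAG_PATTERNS.items():
        hits = sum(1 for value in cleaned if pattern.fullmatch(value))
        if hits / len(cleaned) >= threshold:
            tags.append(tag)
    return tags


print(tag_column(["10001", "10002", " 10003 "]))
print(tag_column(["2020-01-01", "2021-05-17"]))
```

Columns that receive the same tag in different tables are natural first candidates for linkage points.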

27 Helix input sources
Semi-structured sources (API / RDBMS, triple stores)
Online or local file stores
Online web APIs

28 Helix pre-processing
Implemented in the Hadoop ecosystem
1. Schema discovery
2. Full-text indexing
3. Linkage discovery
Output: semantically tagged Global Schema Graph

29 Linkage discovery
All-to-all instance-based matching of all attributes does not scale
Turn the problem into an information retrieval (IR) problem
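The idea of replacing all-to-all comparison with retrieval can be sketched with an inverted index: every cell value is indexed once, and only attributes that share an index entry are ever paired. This is an illustration of the general technique, not Helix's implementation. The table contents, the overlap score, and the 0.5 threshold are hypothetical.

```python
from collections import defaultdict


def norm(value):
    """Normalize a cell value before indexing."""
    return str(value).strip().lower()


def linkage_candidates(tables, min_overlap=0.5):
    """Find attribute pairs from different tables that share many values.

    `tables` maps table name -> {column name -> list of values}.
    """
    # Inverted index: value -> set of (table, column) attributes containing it.
    index = defaultdict(set)
    sizes = {}
    for table, columns in tables.items():
        for column, values in columns.items():
            distinct = {norm(value) for value in values}
            sizes[(table, column)] = len(distinct)
            for value in distinct:
                index[value].add((table, column))

    # Count shared values only for attributes that co-occur in an index entry.
    shared = defaultdict(int)
    for attributes in index.values():
        ordered = sorted(attributes)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                if first[0] != second[0]:  # skip pairs within the same table
                    shared[(first, second)] += 1

    results = []
    for (first, second), count in shared.items():
        score = count / min(sizes[first], sizes[second])
        if score >= min_overlap:
            results.append((first, second, score))
    return sorted(results, key=lambda item: -item[2])


tables = {
    "schools": {"name": ["PS 1", "PS 2"], "zip": ["10001", "10002"]},
    "polling_sites": {"site": ["PS 1", "PS 2", "Library"], "zip": ["10001", "10003"]},
}
for candidate in linkage_candidates(tables):
    print(candidate)
```

Attribute pairs with a high score are the linkage points a user could join on.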

30 Linkage discovery example
Mention the schools that served as polling places. In NY, the KG was used to locate hospitals through graph traversal instead of free-text search.
