Published by Melanie Gibson. Modified over 9 years ago.

Artificial Intelligence Special Lecture Project
Development of Decision Tree Algorithm for Semantic Web Data
2010313148 전동규

Agenda
1. Project Purpose
2. Motivation
3. Related Work
4. Algorithm
5. Description of Problem

1. Project Purpose
- Relational Data
- Semantic Web Data (Linked Data)
- Decision Tree Algorithm
- C4.5 Algorithm
- Semantic Decision Tree

Goal of project
- Develop a new kind of decision tree algorithm that supports decision making based on information from the Semantic Web environment.
- Solve several problems that other related research has already addressed.
- Data: Linked Data (Semantic Web ontology)

1. Project Purpose
Ontology
- Class: definition of a set
- Property: relation between instances
- Instance: an individual that belongs to a class
Schema of Example Ontology
[Figure: schema diagram of the example ontology. Legend: Datatype Property, Object Property, Class, Range type. Labels in the diagram: Rich, name, age, working_at, hasParent, hasChild, Person, Doctor, Teacher, Student, Workplace, Hospital, School, Location, Boolean, Int, String.]

2. Motivation
What is Semantic Web?
Definition of Semantic Web
- The Semantic Web is an evolving development of the World Wide Web in which the meaning (semantics) of information and services on the web is defined, making it possible for the web to understand and satisfy the requests of people and machines to use the web content.
- The Semantic Web is based on ontologies, which correspond to Semantic Web data.
Linked Data
- The term Linked Data describes a method of exposing, sharing, and connecting data on the Web.
[1] Berners-Lee, T. (2001), “The Semantic Web,” Scientific American, Vol. 501.

2. Motivation
Development status of Semantic Web
Increase of Semantic Web Data
Appearance of Semantic Web document search engines
- Falcons: over twenty million RDF/XML documents
- Swoogle: over three million Semantic Web documents
Open data in Semantic Web
- LINKINGOPENDATA: The goal of the W3C SWEO Linking Open Data community project is to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting RDF links between data items from different data sources.
- The data sets consist of over 4.7 billion RDF triples, which are interlinked by around 142 million RDF links (May 2009).

Falcons
Date         RDF/XML       Quadruple
2009-09-02   21,639,337    2,936,868,638
2009-05-29   19,919,364    2,177,084,709

Swoogle
Date         Semantic Web Document   Triple
2009-10-17   3,109,616               1,065,799,526

2. Motivation
Development status of Semantic Web
Increase of Semantic Web Data
Semantic Web based portal site
- Twine: Twine is a Semantic Web portal that builds networks of information from users' posts, which consist of their own information and favorite things.
- All information in Twine is written in RDF and OWL format.
- Twine has millions of visitors a month and millions of relationships among 3 million semantic tags (March 2009).
Mining useful knowledge from very large ontologies is expected to become necessary. Data mining methodology for the Semantic Web should therefore be ready to meet this need.

2. Motivation
Decision Tree in Semantic Web
Traditional decision tree algorithms cannot be applied directly to the Semantic Web
- Semantic Web ontologies have special characteristics for mining.
- Since a Semantic Web document has a network structure, the multi-value issue occurs: one instance can hold several values for the same property.
- A traditional decision tree uses only the values of variables, so the additional information in the Semantic Web cannot be used.
- Converting Semantic Web data into the single-table form used by traditional decision tree algorithms is impossible.
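
The multi-value issue can be illustrated with a small sketch. The triples, instance names, and helper functions below are hypothetical and chosen only for illustration; they are not taken from the presentation. The sketch shows that a naive one-row-per-instance table keeps a single cell per property and therefore drops values.

```python
from collections import defaultdict

# Hypothetical (subject, property, value) triples; not data from the slides.
triples = [
    ("anna", "age", 45),
    ("anna", "hasChild", "ben"),
    ("anna", "hasChild", "cara"),
    ("ben", "age", 20),
]

def group_values(triples):
    """Collect every value of every property for each instance."""
    values = defaultdict(lambda: defaultdict(list))
    for subject, prop, value in triples:
        values[subject][prop].append(value)
    return values

def multi_valued(values):
    """List the (instance, property) pairs that hold more than one value."""
    return sorted(
        (subject, prop)
        for subject, props in values.items()
        for prop, vals in props.items()
        if len(vals) > 1
    )

def naive_flatten(values):
    """One cell per (instance, property): only the first value survives."""
    return {
        subject: {prop: vals[0] for prop, vals in props.items()}
        for subject, props in values.items()
    }

grouped = group_values(triples)
print(multi_valued(grouped))
print(naive_flatten(grouped)["anna"])
```

In this toy data the second child of "anna" disappears from the flattened row, which is the kind of information loss the slide refers to.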

3. Related Work
Arno J. Knobbe [2] developed a decision tree algorithm for multi-relational databases
- The Selection Graph is proposed for building decision trees on an RDB.
- A Selection Graph is composed of nodes, edges, and conditions, and it can be expressed in SQL syntax.
- This research offers a partial solution to the multi-value issue, which also occurs in Semantic Web ontologies. However, the methodology cannot be applied to the Semantic Web, which contains more information than an RDB.
David Jensen [3] proposed a methodology that converts social network data into single-table data to which a traditional decision tree algorithm can be applied
- 'QGraph', a query language for extracting a local network from the entire social network, is proposed.
- QGraph is composed of nodes, edges, and conditions, and it can query many objects at once.
- Since ontology information is manually converted to single-table form, much information is lost.
[2] Arno J. Knobbe, Arno Siebes, Danil Van Der Wallen, Syllogic B. V. (1999). “Multi-Relational Decision Tree Induction,” In Proceedings of PKDD ’99.
[3] D. Jensen and J. Neville (2002, to appear). “Data mining in networks,” Papers of the Symposium on Dynamic Social Network Modeling and Analysis.

4. Algorithm
- The search procedure of the algorithm follows the C4.5 algorithm.
- New methodologies are required to learn concepts in an ontology.
- A 'Constructor' can be used in a similar way to an attribute in a traditional decision tree.
- Related works use the term 'Refinement' for what serves as an attribute in a decision tree.
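
C4.5 ranks candidate splits by gain ratio (information gain divided by split information). The slides do not say how refinements are scored, so the following is only a minimal sketch of the standard C4.5 measure, assuming each candidate assigns every training example to a branch (for instance, "satisfies the refinement" or "does not"). The function names are illustrative.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(items)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(items).values()
    )

def gain_ratio(labels, branches):
    """Gain ratio of a split.

    labels:   class label of each example
    branches: branch each example falls into (same length as labels)
    """
    total = len(labels)
    groups = {}
    for label, branch in zip(labels, branches):
        groups.setdefault(branch, []).append(label)
    # Expected entropy that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises splits into many small branches.
    split_info = entropy(branches)
    return info_gain / split_info if split_info > 0 else 0.0

labels = ["east", "east", "west", "west"]
print(gain_ratio(labels, [True, True, False, False]))   # perfectly separating split
print(gain_ratio(labels, [True, False, True, False]))   # uninformative split
```

A search that follows C4.5 would evaluate each candidate this way at every node and branch on the highest-scoring one.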

4. Algorithm
Refinements
What is a Refinement?
- A refinement is a condition for splitting branches in a decision tree. In this algorithm, properties and classes from the ontology are used as refinements.
- When defining a refinement, role constructors from Description Logic are applied to make the best use of the information in the Semantic Web.
Types of Refinements
- Concept Constructor Refinement: applies the type information of instances
- Cardinality Restriction Refinement: applies cardinality information on an object property
- Domain Restriction Refinement: applies the value of a datatype property
- Qualification Refinement: applies quantification restrictions and the range class of an object property

4. Algorithm
Developed Refinements
Refinement                           Example
Concept Constructor Refinement       Hospital, not Human
Domain Restriction Refinement        Age.(≥ 21)
Cardinality Restriction Refinement   ≥ 3 hasChild
Qualification Refinement             hasChild.Blond
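
The four refinement types can be read as boolean tests on an instance. The sketch below is one possible reading under stated assumptions, not the presentation's implementation: the toy instances and facts are invented, the function names are hypothetical, and the qualification refinement is interpreted existentially (at least one related instance belongs to the class), since the quantifier is not visible in the slide text.

```python
# Hypothetical toy ontology; every instance and fact is an assumption.
types = {
    "anna": {"Human"},
    "ben": {"Human", "Blond"},
    "cara": {"Human"},
    "dan": {"Human"},
    "city_hospital": {"Hospital"},
}
age = {"anna": 45, "ben": 20, "cara": 17, "dan": 12}
has_child = {"anna": ["ben", "cara", "dan"]}

def concept(cls, negate=False):
    """Concept constructor refinement, e.g. Hospital or not Human."""
    return lambda x: (cls in types.get(x, set())) != negate

def domain_at_least(values, threshold):
    """Domain restriction refinement, e.g. Age.(>= 21)."""
    return lambda x: x in values and values[x] >= threshold

def min_cardinality(relation, n):
    """Cardinality restriction refinement, e.g. >= 3 hasChild."""
    return lambda x: len(relation.get(x, [])) >= n

def qualified(relation, cls):
    """Qualification refinement, e.g. hasChild.Blond (existential reading)."""
    return lambda x: any(cls in types.get(y, set()) for y in relation.get(x, []))

def split(instances, refinement):
    """Partition instances into those that satisfy the refinement and the rest."""
    yes = [x for x in instances if refinement(x)]
    no = [x for x in instances if not refinement(x)]
    return yes, no

people = ["anna", "ben", "cara", "dan"]
print(split(people, domain_at_least(age, 21)))
print(split(people, qualified(has_child, "Blond")))
```

Treating each refinement as a function from instance to boolean lets a tree learner use it in the place where a traditional decision tree would test an attribute value.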

4. Algorithm
The list of syntax information which can be expressed in ontology
Language   Syntax
RDF        rdf:type
RDFS       rdfs:domain, rdfs:range, rdfs:subClassOf, rdfs:subPropertyOf
OWL        owl:AllDifferent, owl:allValuesFrom, owl:cardinality, owl:Class, owl:complementOf, owl:DatatypeProperty, owl:DataRange, owl:differentFrom, owl:disjointWith, owl:equivalentClass, owl:equivalentProperty, owl:FunctionalProperty, owl:hasValue, owl:intersectionOf, owl:InverseFunctionalProperty, owl:inverseOf, owl:maxCardinality, owl:minCardinality, owl:ObjectProperty, owl:oneOf, owl:sameAs, owl:someValuesFrom, owl:SymmetricProperty, owl:TransitiveProperty, owl:unionOf

5. Description of Problem
Train problem
- Ten trains
- After learning, the definition found for an eastbound train is as follows:
[Figure: learned definition of an eastbound train]

5. Description of Problem
Train problem
- Artificial task of learning to predict whether a train is headed east or west
- Data consists of relation tuples
Relations
- eastbound(T): train T is eastbound
- has-car(T,C): C is a car of T
- infront(C,D): car C is in front of D
- long(C): car C is long
- open-rectangle(C): car C is shaped as an open rectangle (similar relations for five other shapes)
- jagged-top(C): C has a jagged top
- sloping-top(C): C has a sloping top
- open-top(C): C is open
- contains-load(C,L): C contains load L
- 1-item(C): C has one load item (similar relations for two and three load items)
- 2-wheels(C): C has two wheels
- 3-wheels(C): C has three wheels
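
The relation tuples above can be encoded directly as sets, which makes it easy to test a candidate definition against the labelled trains. The facts for the two trains below are invented for illustration, and `candidate_rule` is a hypothetical hypothesis, not the definition the presentation reports as learned.

```python
# Hypothetical relation tuples for two trains; none of these facts come from the slides.
eastbound = {"t1"}
has_car = {("t1", "c11"), ("t1", "c12"), ("t2", "c21"), ("t2", "c22")}
infront = {("c11", "c12"), ("c21", "c22")}
long_car = {"c21"}
open_top = {"c11", "c21"}
two_wheels = {"c11", "c12", "c22"}
three_wheels = {"c21"}
contains_load = {("c11", "l1"), ("c21", "l2")}

def cars_of(train):
    """All cars C such that has-car(train, C) holds."""
    return sorted(car for t, car in has_car if t == train)

def candidate_rule(train):
    """Illustrative hypothesis: some car is open-topped, two-wheeled, and not long."""
    return any(
        car in open_top and car in two_wheels and car not in long_car
        for car in cars_of(train)
    )

trains = sorted({t for t, _ in has_car})
predicted_east = {t for t in trains if candidate_rule(t)}
print(predicted_east == eastbound)
```

Binary relations such as has-car are what give the data its network structure: a train has several cars, so a property of "the car" has several values per train.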