Published by Duane Owens. Modified over 9 years ago.
1
Ontologies? Semantic Web? OWL? – Making sense of it all
Presenters INTEC Broadband Communication Networks (IBCN) Department of Information Technology (INTEC) Ghent University - IBBT
2
The ontology cloud
Terms in the cloud: Ontology, Formal logic, Reasoning, Semantic Web, RDF, OWL, SWRL, SPARQL
Most of you have heard the word "ontology", and related words like these immediately pop into your mind. In the first part of this presentation I want to give a clear view of all these terms and technologies and how they relate to each other. Then Matthias will give a short tutorial on how to get started quickly with ontologies: how to build them and how to take advantage of their features. Finally, Stijn will go into the advantages of ontologies over other technologies and the tools that are out there.
3
The evolution of the Web
[Figure: eras of the Web plotted against "Connections between Information" and "Connections between people". Eras: PC Era (The PC), The Internet, Web 1.0 (The Web), Web 2.0 (Social Web), Web 3.0 (Semantic Web), Web 4.0 (Intelligent Web). Technologies shown on the chart: PC's, File Systems, File Servers, Databases, SQL, SGML, Windows, MacOS, Groupware, BBS, Gopher, FTP, IRC, USENET, Websites, HTTP, HTML, Java, Flash, XML, SOAP, Javascript, AJAX, RSS, ATOM, P2P, Wikis, Weblogs, Mashups, Widgets, OpenID, SaaS, Office 2.0, Social Networking, Social Media Sharing, Lightweight Collaboration, Directory Portals, Keyword Search, MMO's, VR, RDF, OWL, SPARQL, SWRL, Semantic Databases, Semantic Search, Distributed Search, Intelligent personal agents, Web OS.]
Web 3.0 is not formally defined, and neither was Web 2.0; taken literally these are meaningless terms. The chart's author tried to come up with a definition: each label stands for a decade, and Web 3.0 is the coming third decade of the Web. The focus of this decade is on enriching the structure of the Web and transforming it from something that today is very much like a file server into something that is more like a database.
The chart shows the different decades and how, over time, the connections between people and the connections between information evolve and get richer.
At the end of the PC era the focus was on the front-end: the user experience of the PC, with things like the Windows and Mac interfaces. Information and people were only somewhat connected on these devices.
In the first decade of the Web we saw a big increase in how we can connect and share information and connect to each other. That decade focused more on the back-end of the Web.
Now we are in the decade of Web 2.0, which has been about the front-end of the Web. The emphasis is on the user experience, with things such as AJAX, tagging and social networking that make the Web more usable.
So there is a pendulum that swings each decade between the back-end and the front-end.
Web 3.0 will be more about the back-end again: a fundamental upgrade to the infrastructure of the Web, in which we upgrade how we represent and share data. That is the big trend.
Web 4.0 will swing the pendulum back to the front-end: based on richer, more open data that is all over the Web and accessible to any application, we can make smarter interfaces and smarter tools. Web 4.0 will be about a smarter Web and a smarter user experience.
4
The limitations of keyword search
[Figure: "Productivity of Search" plotted against "Amount of data". Eras: The Desktop (PC Era), The World Wide Web (Web 1.0), The Social Web (Web 2.0), The Semantic Web (Web 3.0), The Intelligent Web (Web 4.0). Techniques along the curve: Files & Folders, Databases, Directories, Keyword search, Tagging, Natural language search, Semantic Search, Reasoning.]
As the amount of data keeps increasing, we have seen an interesting curve in productivity. In the PC era productivity went way up, and it continued to rise when the Web came out. But then it levelled off: it has never been easier to create and publish information, everyone can do it, so the volume of information is rapidly outgrowing the tools we have for managing it.
Today the state of the art for managing data is keyword search, e.g. Google. Keyword search is OK, but if the information explosion continues it will not keep up. We need a smarter way of managing content.
One approach is tagging: adding a little bit of metadata. Metadata is data about data, and tags are data that describe your data.
Next up are things like natural language search, which understands the data a little better: what does this person really mean when they type this search query, what does this document really contain, what is it about?
Semantic search goes a little further. It understands not only the meaning of things but also the connections between things.
Finally we will get to reasoning; logic and AI are for the future.
These technologies will enable us to get past the barrier of keyword search, which has reached its limit. It is not going to get better, it is actually going to get worse, and we have to find a better way to remain productive. That is where semantics become very valuable.
5
Semantic Web – Adding meaning to data
Different methods to add semantics to data: Tagging, Statistics, Linguistics, Ontology – Semantic Web, AI
Semantic Web: Set of open standards by the W3C to add semantics (meaning) to data
Tagging: Tags are easy to add and anyone can do it. That is both a pro and a con: because anyone can do it and tags are just strings with no inherent meaning, people can add bad tags. It also turns out that human beings are lazy and inconsistent: if you ask someone to tag things, the same person will tag things differently even though they are about the same thing.
Statistics: The only way to really make sense of tags is to use statistics over very large amounts of content and tags, across many users. If you do that, you can use tags to understand the data and actually infer its meaning, but only with really large datasets. What is great about this approach is that it is pure math: it does not have to understand the language, the user does not have to do anything, it is massively scalable, it works across any language, and it is basically very easy to do. The problem is that it does not really understand the information, so it is very hard to write a good query. When you use Google, you do a query and then end up doing it again, playing a guessing game; you spend a lot of time just trying to figure out how to ask the question to get what you were trying to find. That is the problem with keyword search. What you really need is a system that understands what you are asking for and maybe even finds things that you did not ask for but that it knows you are looking for. So the statistical approach has limits.
Linguistics: This is the approach behind natural language search. These systems try to understand the meaning of text: they look at a document, analyze the grammar, apply rules and linguistics, and try to figure out "what exactly does this sentence mean?" and "what does this paragraph mean?", reading the text much like a human would. This is great, but it is very computationally intensive, it makes a lot of mistakes, it is language-dependent (you have to program it differently for every language), and it is very hard to scale.
Semantic Web: The W3C standards are very particular technologies, and they are what make up the actual Semantic Web as opposed to semantics in general. There are many semantic technologies, but only one set of them is the Semantic Web. What does "semantic" mean? It is ironic that this term is so ambiguous, because semantic means "meaning"; it is a strange little joke. Semantic technologies are technologies that understand what things mean, or that express, infer or represent the meaning of things; that is all it is. The Semantic Web is one way to do that, and it is about metadata: you create metadata, put it into the data, and it describes the meaning of your data. The benefit is that this metadata is open and built on open standards, so any application can come along and reuse it. If one application does some work and creates metadata about something, it can put that in a place where other applications can use it, so everybody starts to benefit from everybody else's work; at least that is the idea. The problem is that there are not many tools today, the technology is hard to scale today, and who is going to make all this metadata?
AI: The holy grail: HAL 9000, Arthur C. Clarke, software that really thinks. That is going to come, but it will take at least a few decades. Cycorp has spent about 15 years manually entering all human knowledge and they have not finished yet. It is a very big problem, and the biggest part of it is that the knowledge they entered 15 years ago is not always still correct or valid: new things have happened, so keeping up with the pace of human knowledge is a big challenge. I am not sure their approach will ultimately scale. Wikipedia is actually an interesting approach: if someone put an expert system or AI system on top of it, that might be a better way to keep up with the evolution of human consensus regarding reality. There is a lot of research, but nobody has done it yet. I do think AI will be huge in Web 4.0.
6
Ontology - OWL
“An ontology is a specification of a conceptualization in the context of knowledge description”
[Diagram labels: Pizza, has_topping, *, Meat, Is a, Salami, Spiciness; Vegetarian Pizza: Pizza, Not(has_topping some Meat)]
Note: show briefly, then refer to Matthias.
7
Ontology - OWL Structured knowledge representation
Domain
Application
Sharing – Reuse
Support communication
Capture knowledge formally
Reasoning
Extract new knowledge
Note: mention briefly, then refer to Matthias (logically defined classes) and Stijn (advantages).
8
RDF – Store data as “triples”
Subject: Femke
Predicate: Works_at
Object: IBCN
The subject, which is an RDF URI reference or a blank node
The predicate, which is an RDF URI reference
The object, which is an RDF URI reference, a literal or a blank node
(Oversimplified in this presentation)
The basic unit of data is called a triple. It has three parts: a subject, a predicate and an object, e.g. Susan works_for IBM. These simple statements are the fundamental building blocks of knowledge and data. That is how RDF is structured: it structures data very much like we talk.
The key thing is that each of these elements may have a URI, much like a URL. A URI points to some location on the Web where there is further information. So when you make a statement, you can say: Susan, which is actually a URI that represents a data record describing Susan; works_for, which has a URI that defines, somewhere in an ontology, what you mean by works_for; and IBM, which has a URI that points to a representation of IBM.
These three things can be in different places: the Susan record can be in database A, the IBM record in database B, and the predicate can be another data record in database C that simply links them together. So it is a giant mash-up at a very atomic level of data.
What is nice about this is that you can start to link these data records together. They are all connected, whether within one application or across applications, and the result is an open database just like the Web, but for data. So, to the question whether there is a better term than Semantic Web: yes, it is the data Web.
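The triple structure described on this slide can be sketched in a few lines of Python. This is only an illustration, not an RDF library: the `http://example.org/` URIs, the `part_of` predicate and the `match` helper are assumptions made up for the example, while the subject/predicate/object names come from the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical base URI, invented for this illustration only.
EX = "http://example.org/"

TRIPLES = [
    Triple(EX + "people/Femke", EX + "vocab/works_at", EX + "orgs/IBCN"),
    Triple(EX + "people/Susan", EX + "vocab/works_for", EX + "orgs/IBM"),
    # Assumed extra statement, added so that records link to each other.
    Triple(EX + "orgs/IBCN", EX + "vocab/part_of", EX + "orgs/INTEC"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


# "Where does Femke work?" expressed as a pattern with an open object slot.
employers = [
    t.obj
    for t in match(TRIPLES, s=EX + "people/Femke", p=EX + "vocab/works_at")
]
```

Because every part of a statement is just an identifier, the three parts could equally be resolved from three different sources, which is the "mash-up at an atomic level" the notes describe.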
9
The Semantic Web
The social graph just connects people
The semantic graph connects everything…
Nodes: Companies, Products, Services, Web Pages, Multimedia, Documents, Events, Projects, Activities, Interests, Places, People, Groups
Benefits: Better search, More targeted ads, Smarter collaboration, Deeper integration, Richer content, Better personalization
The semantic graph connects everything. The things on the side are the nouns and the lines are the verbs: they say how these things are connected. The social graph is a part of the semantic graph, but the semantic graph contains so much more. So the Semantic Web is like a social graph that connects more kinds of things together.
This provides a lot of opportunities: improved search, better targeting of advertising, better collaboration, better integration between datasets and applications, richer content and much better personalization. Why? Because we can explicitly represent different kinds of things and their relationships, the same way we do in social networks with just people.
Example: you could see how a product is connected to a company or to another product; how an event is related to another event or to a document; who attended that event, who sponsored it and which products were there; and then navigate the graph or network of all these connections. That is what the Semantic Web is going to create.
10
SWRL - SPARQL
SWRL
Define rules by using domain concepts
Adds more expressivity than pure OWL
Person(?p) ^ hasSalaryInPounds(?p, ?pounds) ^ swrlb:multiply(?dollars, ?pounds, 1.9) -> hasSalaryInDollars(?p, ?dollars)
SPARQL
Query data
Similar to SQL but optimized for RDF data
PREFIX foaf: <…>
SELECT ?name
WHERE {
  ?person foaf:mbox … .
  ?person foaf:name ?name .
}
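To make the division of labour between a rule and a query concrete, here is a minimal Python sketch, assuming facts are stored as plain (subject, predicate, object) tuples. It is not SWRL or SPARQL tooling: the sample facts, the function names and the query shape are hypothetical, and only the rule itself and the 1.9 rate come from the slide.

```python
POUNDS_TO_DOLLARS = 1.9  # conversion factor used in the slide's rule

# Hypothetical facts as (subject, predicate, object) tuples.
facts = {
    ("alice", "type", "Person"),
    ("alice", "hasSalaryInPounds", 1000.0),
    ("alice", "name", "Alice"),
    ("acme", "type", "Company"),
    ("acme", "hasSalaryInPounds", 5.0),
}


def apply_salary_rule(facts):
    """Person(?p) ^ hasSalaryInPounds(?p, ?pounds) -> hasSalaryInDollars(?p, ?pounds * 1.9)."""
    persons = {s for (s, p, o) in facts if p == "type" and o == "Person"}
    derived = {
        (s, "hasSalaryInDollars", o * POUNDS_TO_DOLLARS)
        for (s, p, o) in facts
        if p == "hasSalaryInPounds" and s in persons
    }
    # The rule only adds facts; nothing that was asserted is removed.
    return facts | derived


def names_with_dollar_salary(facts):
    """Rough analogue of: SELECT ?name WHERE { ?x hasSalaryInDollars ?d . ?x name ?name }"""
    subjects = {s for (s, p, o) in facts if p == "hasSalaryInDollars"}
    return sorted(o for (s, p, o) in facts if p == "name" and s in subjects)


inferred = apply_salary_rule(facts)
```

The rule derives new facts from existing ones, while the query only reads: it joins two triple patterns on the shared variable, which is the core idea behind matching a SPARQL graph pattern.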
11
Layered cake of the Semantic Web
Layers, top to bottom: Reasoning, SWRL & SPARQL, OWL, RDF, Data triples
RDF: the main standard; the way data is represented, using things that are called triples
OWL: built on RDF; RDF with some more statements in it, giving more expressive power for defining schemas
SPARQL: query language, like SQL but for RDF
SWRL: rule language. Some other rule languages for the Semantic Web also exist; none of these have been standardized
GRDDL: for transforming data, so that you can say "this is how you can take this XML data and turn it into RDF on the fly". You can make GRDDL profiles for websites that enable anyone who wants to see your site in RDF to get the RDF immediately
12
Tutorial: Building an OWL Ontology
Department of Information Technology – Broadband Communication Networks (IBCN)
13
Named & Disjoint Classes
OWL Classes are assumed to ‘overlap’. We therefore cannot assume that an individual is not a member of a particular class simply because it has not been asserted to be a member of that class. In order to ‘separate’ a group of classes we must make them disjoint from one another. This ensures that an individual which has been asserted to be a member of one of the classes in the group cannot be a member of any other classes in that group. In our above example Pizza, PizzaTopping and PizzaBase have been made disjoint from one another. This means that it is not possible for an individual to be a member of a combination of these classes – it would not make sense for an individual to be a Pizza and a PizzaBase!
14
Class Hierarchy
Up to this point, we have created some simple named classes, some of which are subclasses of other classes. The construction of the class hierarchy may have seemed rather intuitive so far. However, what does it actually mean to be a subclass of something in OWL? For example, what does it mean for VegetableTopping to be a subclass of PizzaTopping, or for TomatoTopping to be a subclass of VegetableTopping? In OWL subclass means necessary implication. In other words, if VegetableTopping is a subclass of PizzaTopping then ALL instances of VegetableTopping are instances of PizzaTopping, without exception: if something is a VegetableTopping then this implies that it is also a PizzaTopping.
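The "necessary implication" reading of subclass can be sketched as a walk up the hierarchy. This is a simplification, assuming each class has at most one direct superclass (OWL allows several); the class names follow the tutorial, but the exact hierarchy in the dictionary and the helper functions are assumptions made for illustration.

```python
# Direct subclass assertions (child -> parent). Assumed single inheritance.
SUBCLASS_OF = {
    "TomatoTopping": "VegetableTopping",
    "VegetableTopping": "PizzaTopping",
    "MozzarellaTopping": "CheeseTopping",
    "CheeseTopping": "PizzaTopping",
}


def superclasses(cls):
    """All classes that membership of `cls` necessarily implies."""
    result = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.append(cls)
    return result


def is_instance_of(asserted_class, target):
    """An individual asserted in one class is also in every superclass."""
    return target == asserted_class or target in superclasses(asserted_class)
```

For example, `is_instance_of("TomatoTopping", "PizzaTopping")` holds because the implication chains through VegetableTopping, without any explicit statement linking TomatoTopping to PizzaTopping.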
15
Object Properties
16
Object Property Characteristics
If a property is transitive then its inverse property should also be transitive. Note that if a property is transitive then it cannot be functional. The reason for this is that transitive properties, by their nature, may form ‘chains’ of individuals. Making a transitive property functional would therefore not make sense.
17
Property Domains & Ranges
When hasTopping has the domain Pizza and the range PizzaTopping, individuals that are used ‘on the left hand side’ of the hasTopping property will be inferred to be members of the class Pizza. Any individuals that are used ‘on the right hand side’ of the hasTopping property will be inferred to be members of the class PizzaTopping. For example, if we have individuals a and b and an assertion of the form a hasTopping b then it will be inferred that a is a member of the class Pizza and that b is a member of the class PizzaTopping. This will be the case even if a has not been asserted to be a member of the class Pizza and/or b has not been asserted to be a member of the class PizzaTopping.
It is important to realise that in OWL domains and ranges should not be viewed as constraints to be checked. They are used as ‘axioms’ in reasoning. For example if the property hasTopping has the domain set as Pizza and we then applied the hasTopping property to IceCream (individuals that are members of the class IceCream), this would generally not result in an error. It would be used to infer that the class IceCream must be a subclass of Pizza! An error will only be generated (by a reasoner) if Pizza is disjoint from IceCream.
It is possible to specify multiple classes as the range for a property. If multiple classes are specified in Protégé 4, the range of the property is interpreted to be the union of the classes. For example, if the range of a property has the classes Man and Woman listed in the range view, the range of the property will be interpreted as Man union Woman.
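The difference between "domain and range as axioms" and "domain and range as constraints" can be illustrated with a small Python sketch. It assumes a single domain and range per property and plain string names; the `infer_types` helper is hypothetical and only mimics the inference described above, it is not how a reasoner is implemented.

```python
# Domain and range axioms for the tutorial's hasTopping property.
DOMAIN = {"hasTopping": "Pizza"}
RANGE = {"hasTopping": "PizzaTopping"}


def infer_types(assertions):
    """Map each individual to the classes implied by domain/range axioms.

    Nothing is rejected here: an assertion is never checked against the
    domain or range, it is only used to derive new class memberships.
    """
    types = {}
    for subject, prop, obj in assertions:
        if prop in DOMAIN:
            types.setdefault(subject, set()).add(DOMAIN[prop])
        if prop in RANGE:
            types.setdefault(obj, set()).add(RANGE[prop])
    return types


# a and b have no asserted classes, yet both get a type from the axioms.
inferred = infer_types([("a", "hasTopping", "b")])
```

A validating system would instead raise an error when `a` is not already known to be a Pizza; here the same statement silently produces the new knowledge that it is one.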
18
Property Restrictions
A restriction describes an anonymous class of individuals based on the relationships that members of the class participate in.
3 main categories:
Quantifier Restrictions (Existential restrictions, Universal restrictions)
Cardinality Restrictions
hasValue Restrictions
19
Existential Restriction
We have added restrictions to MargheritaPizza to say that a MargheritaPizza is a NamedPizza that has at least one kind of MozzarellaTopping and at least one kind of TomatoTopping. More formally (reading the class description view line by line), if something is a member of the class MargheritaPizza it is necessary for it to be a member of the class NamedPizza and it is necessary for it to be a member of the anonymous class of things that are linked to at least one member of the class MozzarellaTopping via the property hasTopping, and it is necessary for it to be a member of the anonymous class of things that are linked to at least one member of the class TomatoTopping via the property hasTopping.
20
Reasoning
Key Features
Classification: test whether or not one class is a subclass of another class
Consistency checking: check whether or not it is possible for a class to have any instances
21
Consistency Checking
Why did this happen? Intuitively we know something cannot at the same time be both cheese and a vegetable. Something should not be both an instance of CheeseTopping and an instance of VegetableTopping. However, it must be remembered that we have chosen the names for our classes. As far as the reasoner is concerned names have no meaning. The reasoner cannot determine that something is inconsistent based on names. The actual reason that ProbeInconsistentTopping has been detected to be inconsistent is that its superclasses VegetableTopping and CheeseTopping are disjoint from each other; remember that earlier on we specified that the four categories of topping were disjoint from each other. Therefore, individuals that are members of the class CheeseTopping cannot be members of the class VegetableTopping and vice-versa.
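The point that inconsistency comes from disjointness axioms, not from names, can be shown with a deliberately tiny check. This sketch assumes disjointness is stored as explicit pairs and that only the listed superclasses are examined; a real reasoner also follows inherited and inferred superclasses. The `is_satisfiable` function and the MeatTopping example are assumptions for illustration.

```python
from itertools import combinations

# Explicit disjointness axioms, stored as unordered pairs of class names.
DISJOINT = {frozenset({"CheeseTopping", "VegetableTopping"})}


def is_satisfiable(superclasses):
    """A class can have instances only if no two superclasses are disjoint."""
    return not any(
        frozenset(pair) in DISJOINT for pair in combinations(superclasses, 2)
    )


# ProbeInconsistentTopping from the tutorial has both disjoint superclasses.
probe_ok = is_satisfiable(["CheeseTopping", "VegetableTopping"])
```

If the disjointness axiom is removed from `DISJOINT`, the same probe class becomes satisfiable, even though its superclass names still "sound" contradictory.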
22
Necessary & Sufficient Conditions
Primitive Class: class that only has ‘necessary’ conditions
Defined Class: class that has at least one set of ‘necessary and sufficient’ conditions
We have converted our description of CheesyPizza into a definition. If something is a CheesyPizza then it is necessary that it is a Pizza and it is also necessary that it has at least one topping that is a member of the class CheeseTopping. Moreover, if an individual is a member of the class Pizza and it has at least one topping that is a member of the class CheeseTopping then these conditions are sufficient to determine that the individual must be a member of the class CheesyPizza. It is also important to understand that the reasoner can only automatically classify classes under defined classes, i.e. classes with at least one set of necessary and sufficient conditions.
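A rough way to picture "necessary and sufficient" is a membership test that can be run in the forward direction. The sketch below assumes individuals are plain dictionaries with a `types` set and a `toppings` list, which is an invented representation; the CheesyPizza conditions themselves come from the slide.

```python
def is_cheesy_pizza(individual):
    """Sufficient conditions for CheesyPizza: a Pizza with some CheeseTopping."""
    return "Pizza" in individual["types"] and any(
        "CheeseTopping" in topping["types"] for topping in individual["toppings"]
    )


# Hypothetical individuals, invented for this example.
mozzarella = {"types": {"MozzarellaTopping", "CheeseTopping"}, "toppings": []}
tomato = {"types": {"TomatoTopping", "VegetableTopping"}, "toppings": []}

margherita = {"types": {"Pizza"}, "toppings": [mozzarella, tomato]}
plain = {"types": {"Pizza"}, "toppings": [tomato]}
```

Note that a `False` result only means membership could not be shown from the known toppings; under OWL's open world assumption the individual might still turn out to be a CheesyPizza once more facts are known.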
23
Automated Classification
Computing subclass-superclass relationships is vital to keep large ontologies in a logically correct state
24
Universal Restrictions
Constrain the relationships along a given property to individuals that are members of a specific class
They don’t specify the existence of a relationship
The above universal restriction ∀ hasTopping MozzarellaTopping also describes the individuals that do not participate in any hasTopping relationships. An individual that does not participate in any hasTopping relationships whatsoever, by definition does not have any hasTopping relationships to individuals that aren’t members of the class MozzarellaTopping and the restriction is therefore satisfied.
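The vacuous satisfaction described above mirrors how "for all" and "there exists" behave over an empty collection. A minimal sketch, assuming a pizza's toppings are given as a complete, closed list of class names (which OWL itself does not assume); the function names are hypothetical.

```python
def only_mozzarella(toppings):
    """Universal restriction: every known topping is a MozzarellaTopping."""
    return all(t == "MozzarellaTopping" for t in toppings)


def some_mozzarella(toppings):
    """Existential restriction: at least one topping is a MozzarellaTopping."""
    return any(t == "MozzarellaTopping" for t in toppings)


no_toppings = []
universal_holds = only_mozzarella(no_toppings)    # vacuously satisfied
existential_holds = some_mozzarella(no_toppings)  # nothing to witness it
```

This is why a universal restriction is usually paired with an existential one when the intent is "has toppings, and all of them are mozzarella".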
25
Open World Assumption
It cannot be assumed that something does not exist until it is explicitly stated that it does not exist!
Closed World Assumption (programming languages, databases, …)
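The contrast between the two assumptions can be sketched as two lookup functions over the same facts. This is an illustration only: the `KNOWN_TRUE` and `KNOWN_FALSE` sets, the sample facts and the use of `None` for "unknown" are assumptions chosen for the example.

```python
from typing import Optional

# Facts explicitly stated to hold, and facts explicitly stated not to hold.
KNOWN_TRUE = {("MargheritaPizza", "hasTopping", "MozzarellaTopping")}
KNOWN_FALSE = set()


def closed_world(fact) -> bool:
    """Database-style answer: anything not stored is treated as false."""
    return fact in KNOWN_TRUE


def open_world(fact) -> Optional[bool]:
    """OWL-style answer: unstated facts are unknown (None), not false."""
    if fact in KNOWN_TRUE:
        return True
    if fact in KNOWN_FALSE:
        return False
    return None


unstated = ("MargheritaPizza", "hasTopping", "MeatTopping")
```

Here `closed_world(unstated)` answers `False`, while `open_world(unstated)` answers `None`: the pizza might have a meat topping that simply has not been mentioned, which is the gap a closure axiom is meant to fill.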
26
Closure Axiom
27
Value Partition
Restricting the possible values for a property to an exhaustive list
Design Pattern
Covering Axiom
28
Value Partition
In the final step we created a restriction that had the class expression (PizzaTopping and hasSpiciness some Hot) rather than a named class as its filler. This filler was made up of an intersection between the named class PizzaTopping and the restriction hasSpiciness some Hot. Another way to do this would have been to create a subclass of PizzaTopping called HotPizzaTopping and define it to be a hot topping by having a necessary condition of hasSpiciness some Hot. We could have then used hasTopping some HotPizzaTopping in our definition of SpicyPizza. Although this alternative way is simpler, it is more verbose. OWL allows us to essentially shorten class descriptions and definitions by using class expressions in place of named classes as in the above example.
29
Cardinality Restrictions
For a property P, cardinality restrictions describe the minimum, maximum or exact number of P relationships that an individual can participate in.
30
Qualified Cardinality Restriction
31
Datatype Properties
32
Data Properties
33
Open World Reasoning bis
The complement of a class includes all of the individuals that are not members of the class. By making NonVegetarianPizza a subclass of Pizza and the complement of VegetarianPizza we have stated that individuals that are Pizzas and are not members of VegetarianPizza must be members of NonVegetarianPizza. Note that we also made VegetarianPizza and NonVegetarianPizza disjoint so that if an individual is a member of VegetarianPizza it cannot be a member of NonVegetarianPizza.
34
Open World Reasoning bis
As expected (because of Open World Reasoning) UnclosedPizza has not been classified as a VegetarianPizza. The reasoner cannot determine UnclosedPizza is a VegetarianPizza because there is no closure axiom on the hasTopping property and the pizza might have other toppings. We therefore might have expected UnclosedPizza to be classified as a NonVegetarianPizza since it has not been classified as a VegetarianPizza. However, Open World Reasoning does not dictate that because UnclosedPizza cannot be determined to be a VegetarianPizza it is not a VegetarianPizza: it might be a VegetarianPizza and also it might not be a VegetarianPizza! Hence, UnclosedPizza cannot be classified as a NonVegetarianPizza.
35
hasValue Restriction
The conditions that we have specified for MozzarellaTopping now say that: individuals that are members of the class MozzarellaTopping are also members of the class CheeseTopping and are related to the individual Italy via the hasCountryOfOrigin property and are related to at least one member of the class Mild via the hasSpiciness property. In more natural English, things that are kinds of mozzarella topping are also kinds of cheese topping and come from Italy and are mildly spicy.
36
Enumerated Classes
This means that an individual that is a member of the Country class must be one of the listed individuals (i.e. one of America, England, France, Germany, Italy). More formally, the class Country is equivalent to (contains the same individuals as) the anonymous class that is defined by the enumeration.
37
Multiple Sets of Necessary & Sufficient Conditions
38
Ontologies – more than just a datamodel … but !
39
Important Consideration
ONTOLOGY ≠ DATA-MODEL
ONTOLOGY = DOMAIN-MODEL
One global ontology
Standardized view of the world
Reuse in several apps
Applications should adapt to the standardized domain-model
40
Three common layers
Layers: Ontology, Rules, Application
Labels on the diagram: +/- STATIC, REUSE, DYNAMIC LOGIC
Of course … common view of the world … different reference views
Different perceptions
Different rules in the world
2 separate layers in the ontology
- static conceptual level
- dynamic logical rule-base on top
Facilitates application reuse
- Logic of the application shifted to the model
Example: InteGRail!
41
What do you need in an ontology-based application?
Data Sources: Legacy, Ontology A-Box, …
Persistency: Relational DB, Files, Triple Store
Reasoning: Pellet, Fact++, …, None
Rules: Jess, Bossam
Application Support: Jena, Sesame, Redland
SHARED ONTOLOGY MODEL
42
Typical Ontology Service
[Diagram labels: SPARQL (three endpoints), JOSEKI, JENA, D2RQ, RDF123, TDB, SDB, Spreadsheet, MySQL, PopulatorA, PopulatorB, PopulatorC, PopulatorD, PopulatorE]
Note: do not elaborate too much on D2R and RDF123; they come in the next slides.
43
D2R-Server: Treating Non-RDF Databases as Virtual RDF Graphs
44
RDF123 is an application and web service to generate RDF data from spreadsheets
45
Recent commercial initiatives
Ontology.com: Thinking Service Models
Metatomix: Semantic web-based solutions for Enterprise Resource Interoperability
TopQuadrant: Making Information Work for the Enterprise
Semantic Discovery Systems: Beyond Business Intelligence, from Analytics to Discovery
Oracle 11g: Open, scalable, secure and reliable RDF management
46
Every feature at a certain cost
[Chart labels: Genericness, Domain Modeling, Application Reuse, Performance, SWRL Rules, First-Order Logic, Concepts, A-Box Size]
The combination of logic and A-Box size drastically reduces the overall performance.
You could think partitioning is the solution. Not really: reasoning would ideally take the whole world into consideration, and there is a chance of losing inherent knowledge while partitioning.
47
Questions ? Presenters Femke.Ongenae@intec.ugent.be
Some slides and graphs borrowed from the presentation “Making sense of the Semantic Web” by Nova Spivack