Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Semantic Web: Vaporware or Worthy Dream? Slides adapted from Nick Kushmerick Rose colored glasses are never made in bi-focals because no-body wants to.

Similar presentations


Presentation on theme: "1 Semantic Web: Vaporware or Worthy Dream? Slides adapted from Nick Kushmerick Rose colored glasses are never made in bi-focals because no-body wants to."— Presentation transcript:

1 1 Semantic Web: Vaporware or Worthy Dream? Slides adapted from Nick Kushmerick Rose colored glasses are never made in bi-focals because no-body wants to read smallprint in dreams

2 2 SemanticWeb “ The Semantic Web ” (Berners-Lee, Hendler, Lassila; Scientific American, May 2001) “W3C Semantic Web Activity” (Koivenen, Miller; Dec 2001) COMP-4016 ~ Computer Science Department ~ University College Dublin ~ www.cs.ucd.ie/staff/nick ~ © Nicholas Kushmerick 2002

3 3 Today’s Syntactic Web HTML (modest compliance with standards thanks to robust browsers) Hyperlinks (no data types; just annnotated with text [sometimes merely “Click here”!]; often dangling references) Human eyeballs & common sense (Just barely?!) suitable/scalable for –“trivial pursuit” information retrieval What’s the capital of Botswana? Will it rain tomorrow? … –“mundane” transactions/services Buying a book; Playing a game of chess; …

4 4 Automating people out of the loop “The bane of my existence is doing things that I know the computer could do for me” (D Connolly; W3C guru) 150 years ago, the telephone was outrageously sophisticated: “Do you seriously predict that every room in every building will have a small device that you type a few numbers into and you can talk to the person in any other room of any other building in the entire world??!!” 30 years ago, email was outrageously sophisticated: “Do you seriously predict that every person will have a small device that you can type a person’s name into and you can send a private message to any other person in the world, that they can read even on the beach in Tahiti?!!?!” 15 years ago, the Web was outrageously sophisticated: “Do you seriously predict that every person will have a device with which they can send their grocery list to the shop and in a few hours the groceries arrive!!??!!” Why can’t my online calendar & bank account negotiate with my garage’s to arrange a mutually convenient time & price to repair my leaking tyre?

5 5 The Challenge All the relvant data is (or soon will be) “on the Web” -- but in a form specialized to human vision/processing, not automated machine processing How can my agent find/parse/extract garage’s free times? Which of my appointments are critical/flexible? Even if I annotated entries, what if the garage’s timetable doesn’t have such a concept? And there are dozens of constraints: How long will it take to get to the garage? Would I pay extra if they have can collect the car? Can they repair the door lock too?

6 6 What/where is the Semantic Web? Layered on top of existing Web. (Just like HTTP is built on top of TCP, which is on top of IP, which is on top of the data-link layer) TCP IP Data-Link research / vapourware solid implementations

7 7 12/9

8 8 Final Agenda… 12:15-12:16 Pointer about Online EvaluationsOnline Evaluations 12:16-12:17 Final Exam related qns 12:17-12:45 Semantic Web 12:45-1pm Buffer (Bridge Topics…) 1-1:25pm Interactive Semester Review 1:25-1:30pm Adieu

9 9 Final Exam Mostly about the second half of the semester –Although I reserve the right to ask questions that combine the two halves Will be take-home Will (likely be) put up on Thursday You will have about 3 full days to work on the exam –There will be escape clauses for folks who have two exams crammed in those three days Exam has to be submitted in HARD COPY.

10 10 Layer 1: URI Everything is a “Resource” (people, books, the attribute “title” of an Amazon “book” object, Web pages, the concept “laziness”, …) Ever resource has a unique identifier -- Uniform Resource Identifier eg, the URI of a Web Page is its URL Eg, the URI of my email address is mailto:nick@ucd.ie Owner of object can pick any URI they want as long is it is unique. Often has “URL-like” syntax but that is purely convention/arbitrary

11 11 Layer 2: XML Use XML as common formatting standard for encoding data. (Could invent a new format for every kind of data but why bother?) War & Peace … book title … title name … Data Meta-Data Meta- Meta-Data Danger/Warning: Made-up syntax!!

12 12 XML Schema An XML Schema document is an XML document that defines a set of XML tags (and how they may be used)

13 13 XML Namespaces An XML documents may use tags defined in more than one XML Schema document “Namespace” prefixes (xxx:yyy) are used to unambiguously point to the defining XML Schema document Nick’s Home Page

14 14 Layer 3: RDF All data/knowledge/facts/opinions/information is expressed on the Semantic Web as “Resource Description Framework” statements Very simple language for making assertions: –Triple: (value) (attribute) (object) –(nick@ucd.ie) (is email address of) (Nick Kushmerick) –(0140444173) (is ISBN number of) (War & Peace) –(field 5 of database A) (is a field of type) (postal code)

15 15 Everything is XML Remember (Nick’s Home Page) (is title of) (http://www.cs.ucd.ie/staff/nick) is actually encoded as some very ugly XML: Nick’s Home Page

16 16 Layer 4: Ontologies (RDF Schema) There are lots of common RDF attribute-sets for lots of common tasks eg -- “Dublin Core” [Ohio, Sorry!] defines a few dozen standard attributes for asserting statements about documents: title, author, date, version, format, owner, … But suppose you want to define your own concepts/attributes -- –RDF Schema = set of RDF tags for defining a new set of RDF tags (no, this isn’t circular)

17 17 RDF Schema for Dublin Core Ontology

18 18 Layer 4½: Mapping Between Ontologies Taxonomy Crisis: –How can your agent know that my “title” is your “name”?! –How can my agent know that some of your “address” objects are post-boxes, not physical addresses?! –How can my agent know that many Asian first names correspond to Western surnames? Semantic Web Solution: Services for translating/mapping between “related” ontologies. Suppose Amazon.com uses Dublin Core (“title”), while Fred Hanna uses it’s own document ontology (“name”). So far … my agent is forced to choose a ontology, or must be carefully crafted to understand both lanuages A better solution: A niche now exists for a independent entity (UniversalBookInfo.com) that maps “title”  “name” etc

19 19 without UniversalBookInfo.com Nick wants to buy War & Peace Nick’s very complicated agent Amazon Fred Hanna Amazon ontology FredHanna ontology Programmer’s bank account €€€

20 20 with UniversalBookInfo.com Nick wants to buy War & Peace Nick’s agent Amazon Universal BookInfo.com Fred Hanna Joe’s agent Jane’s Agent Bank Account €€€ €€ €

21 21 Layer 5: Logic Ontologies also allow axioms –“All people have brains” Expressiveness: Key challenge in formalizing axioms: want to be able to say anything you need to in a particular domain. –“All people have brains, except George Bush.” But more expressive logics mean slower inference –Intuitively, applying a rule such as “You can’t fool all of the people all of the time” could require checking everyone in the universe to determine if there exists even one foolable person. DAML + OIL

22 22 Integrating Services Source can be “services” rather than “data repositories” –Eg. Amazon as a composite service for book buying –Separating line is somewhat thin Handling services –Description (API;I/O spec) WSDL –Composition Planning in general –Execution Data-flow architectures –See next part

23 23 Who will annotate the data? Semantic web works if the users annotate their pages using some existing ontology (or their own ontology, but with mapping to other ontologies) –But users typically do not conform to standards.. and are not patient enough for delayed gratification… –The way to force them is to act as if you are helping them write web-pages Currently most people don’t write their HTML code—the MS frontpages and Claris Homepages of the world do.. What if we change the MS Frontpage/Claris Homepage so that they (slyly) add annotations? E.g. The Mangrove project at U. Wash.Mangrove –Help user in tagging their data (allow graphical editing) –Provide instant gratification by running services that use the tags.

24 24 Layer 6: Proofs ugly XML encoding Proof Verifier Yes this proof is correct No this proof is flawed (Easy to build once the Logic layer is fixed)

25 25 Proofs: Huh?!??! ugly XML encoding Proof Verifier Yes this proof is correct No this proof is flawed (Easy to build once the Logic layer is fixed) I am an employee of XYZ Corp (because it says so on this Web page, which is an XYZ Corp official document) I would like to buy this book; please send my company an invoice OK, book successfully ordered Sorry, we need a credit card!

26 26 Proofs  Trust In the Semantic Web, a “proof” is a procedure that can be automatically followed in order to verify an assertion. Believability is always relative to a set of resources that you trust –I own bank account #239489248234, because my Digital Signature XXXX matches the record on Web page http://bank.com/accounts, and you trust this page because you own bank.com

27 27 Summary Distributed global information ecosystem enables wide variety of value-added information services (monitoring your online purchases; finding entertainment in which you might be interested; scheduling appointments; …) But doing so is difficult/impossible if relevant data is tied up in legacy documents intended for human eyes/common sense The Semantic Web as Global Database/Brain for All Humanity -- Probably hopelessly futile But within sufficiently motivated (ie, rich) segments of the Web today’s Syntactic Web may well evolve into A Semantic Web Rose colored glasses are never made in bi-focals because no-body wants to read smallprint in dreams

28 28 Final Agenda… 12:15-12:16 Pointer about Online EvaluationsOnline Evaluations 12:16-12:17 Final Exam related qns 12:17-12:45 Semantic Web 12:45-1pm Buffer (Bridge Topics…) 1-1:25pm Interactive Semester Review 1:25-1:30pm Adieu

29 29 Soft Joins..WHIRL [Cohen] Often you have database relations some of whose fields are “Textual” –E.g. a movie database, which has, in addition to year, director etc., a column called “Review” which is unstructured text Normal DB operations ignore this unstructured stuff (can’t join over them). –SQL sometimes supports “Contains” constraint (e.g. give me movies that contain “Rotten” in the review We can extend the notion of Joins to “Similarity Joins” where similarity is measured in terms of vector similarity over the text attributes. So, the join tuples are output n a ranked form—with the rank proportional to the similarity –Neat idea… but does have some implementation difficulties –Most tuples in the cross-product will have non-zero similarities. So, need query processing that will somehow just produce highly ranked tuples

30 30 BANKSBANKS: Keyword Search in DB

31 31 Final Agenda… 12:15-12:16 Pointer about Online EvaluationsOnline Evaluations 12:16-12:17 Final Exam related qns 12:17-12:45 Semantic Web 12:45-1pm Buffer (Bridge Topics…) 1-1:25pm Interactive Semester Review 1:25-1:30pm Adieu

32 32 Course Outcomes After this course, you should be able to answer: –How search engines work and why are some better than others –Can web be seen as a collection of (semi)structured databases? If so, can we adapt database technology to Web? –Can useful patterns be mined from the pages/data of the web? What did you think these were going to be?? REVIEW

33 33 Main Topics Approximately three halves plus a bit: –Information retrieval –Information integration/Aggregation –Information mining –other topics as permitted by time REVIEW

34 34 Adapting old disciplines for Web-age Information (text) retrieval –Scale of the web –Hyper text/ Link structure –Authority/hub computations Databases –Multiple databases Heterogeneous, access limited, partially overlapping –Network (un)reliability Datamining [Machine Learning/Statistics/Databases] –Learning patterns from large scale data REVIEW

35 35 A Farside treasury…

36 36 Final Agenda… 12:15-12:16 Pointer about Online EvaluationsOnline Evaluations 12:16-12:17 Final Exam related qns 12:17-12:45 Semantic Web 12:45-1pm Buffer (Bridge Topics…) 1-1:25pm Interactive Semester Review 1:25-1:30pm Adieu

37 37 Rao: I could've taught more...I could've taught more, if I'd just...I could've taught more... Sree: Rao, there are twenty people who are mad at you because you taught too much. Look at them. Rao: If I'd made more time...I wasted so much time, you have no idea. If I'd just... Sree: There will be generations (of bitter people) because of what you did. Rao: I didn't do enough. Sree: You did so much. Rao: This slide. We could’ve removed this slide. Why did I keep the slide? Two minutes, right there. Two minutes, two more minutes.. This music, a bit on p2p. This review. Two points on custom portals. I could easily have made two for it. At least one. I could’ve gotten one more point across. One more. One more point. A point, Sree. For this. I could've gotten one more point across and I didn't.  Adieu with an Oscar Schindler Routine.. Schindler: I could've got more...I could've got more, if I'd just...I could've got more... Stern: Oskar, there are eleven hundred people who are alive because of you. Look at them. Schindler: If I'd made more money...I threw away so much money, you have no idea. If I'd just... Stern: There will be generations because of what you did. Schindler: I didn't do enough. Stern: You did so much. Schindler: This car. Goeth would've bought this car. Why did I keep the car? Ten people, right there. Ten people, ten more people...(He rips the swastika pin from his lapel) This pin, two people. This is gold. Two more people. He would've given me two for it. At least one. He would've given me one. One more. One more person. A person, Stern. For this. I could've gotten one more person and I didn't. Top few things I would have done if I had more time More on dimensionality reduction Extracting tagged info from text pages More on structure—non-structure integration Soft joins[Cohen]; Keyword search on DB Customized portal generation Optimization issues in Data Integration P2P mediation Services—and service standards Be less demanding more often (or even once…)

38 38


Download ppt "1 Semantic Web: Vaporware or Worthy Dream? Slides adapted from Nick Kushmerick Rose colored glasses are never made in bi-focals because no-body wants to."

Similar presentations


Ads by Google