Semantic Web research anno 2006: main streams, popular fallacies, current status, future challenges. Frank van Harmelen, Vrije Universiteit Amsterdam.

2 This is NOT a Semantic Web evangelization talk (I assume you are already converted)

This is a “topical” talk. Webster: “referring to the topics of the day, of temporary interest”

Which Semantic Web are we talking about? (Section: main streams)

5 General idea of Semantic Web: make the current web more machine accessible (currently all the intelligence is in the user). Motivating use-cases: Search engines: concepts, not keywords; semantic narrowing/widening of queries. Shopbots: semantic interchange, not screenscraping. E-commerce: negotiation, catalogue mapping, data-integration. Web Services: need semantic characterisations to find them. Navigation: by semantic proximity, not hardwired links. …
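
To make "semantic narrowing/widening of queries" concrete, here is a minimal sketch. It assumes a tiny, made-up IS-A hierarchy (the concept names are illustrative and do not come from the talk): widening a query walks up to broader concepts, narrowing walks down to more specific ones.

```python
# Hypothetical IS-A hierarchy (child concept -> parent concept).
# The concept names are illustrative only.
IS_A = {
    "aspirin": "painkiller",
    "ibuprofen": "painkiller",
    "painkiller": "drug",
    "antibiotic": "drug",
}


def widen(concept):
    """Return the concept followed by all broader concepts (its ancestors)."""
    result = [concept]
    while concept in IS_A:
        concept = IS_A[concept]
        result.append(concept)
    return result


def narrow(concept):
    """Return the concept together with all narrower concepts (its descendants)."""
    result = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parent in IS_A.items():
            if parent == current and child not in result:
                result.add(child)
                frontier.append(child)
    return result
```

A search engine working on concepts rather than keywords could, under these assumptions, retry an over-specific query with `widen("aspirin")` or refine a vague one with `narrow("painkiller")`.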

6 General idea of Semantic Web (2). Do this by: 1. making data and meta-data available on the Web in machine-understandable (formalised) form; 2. structuring the data and meta-data in ontologies. These are non-trivial design decisions. Alternative would be:

7 “Machine-understandable form” (what it’s like to be a machine). [Figure: META-DATA over the terms symptoms, drug administration, disease, with relations labelled IS-A and alleviates.]
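
One way to picture meta-data in machine-understandable form is as subject/predicate/object statements. The sketch below is an assumption-laden illustration: the relation names IS-A and alleviates come from the slide, but the individual statements (aspirin, headache, and so on) are invented for the example and are not the slide's actual graph.

```python
from collections import namedtuple

# One statement of meta-data: subject --predicate--> object.
Triple = namedtuple("Triple", "subject predicate object")

# Hypothetical statements; only the relation names appear on the slide.
TRIPLES = [
    Triple("aspirin", "IS-A", "drug"),
    Triple("headache", "IS-A", "symptom"),
    Triple("aspirin", "alleviates", "headache"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]
```

With this representation a program can answer "what does aspirin alleviate?" via `match(TRIPLES, subject="aspirin", predicate="alleviates")` without interpreting any natural-language text.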

8 Expressed using the W3C stack

9 Which Semantic Web? Version 1: "Semantic Web as Web of Data" (TBL, i.e. Tim Berners-Lee). Recipe: expose databases on the web, use RDF, integrate. Meta-data from: expressing DB schema semantics in machine-interpretable ways. Enables integration and unexpected re-use.
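
The "expose databases on the web, use RDF" recipe can be sketched as a row-to-triples mapping. This is a simplified illustration, not a standard mapping: the namespace `http://example.org/`, the URI layout and the function names are all assumptions, and every column value is emitted as a plain string literal in N-Triples syntax.

```python
BASE = "http://example.org/"  # hypothetical namespace, chosen for illustration


def escape_literal(value):
    """Escape the characters that must be escaped in an N-Triples string literal."""
    text = str(value)
    text = text.replace("\\", "\\\\").replace('"', '\\"')
    return text.replace("\n", "\\n").replace("\r", "\\r")


def row_to_ntriples(table, key, row):
    """Turn one database row (a dict) into N-Triples lines.

    The primary-key value names the subject; every other column becomes
    a predicate whose object is the column value as a string literal.
    """
    subject = f"<{BASE}{table}/{row[key]}>"
    lines = []
    for column, value in row.items():
        if column == key:
            continue
        predicate = f"<{BASE}{table}#{column}>"
        lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines
```

For example, `row_to_ntriples("album", "id", {"id": 7, "title": "Kind of Blue"})` yields one statement about the resource `album/7`. Once two databases are exposed this way, integration reduces to agreeing on (or mapping between) the predicate names.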

10 Which Semantic Web? Version 2: “Enrichment of the current Web”. Recipe: annotate, classify, index. Meta-data from: automatically producing markup (named-entity recognition, concept extraction, tagging, etc.). Enables personalisation, search, browse, …

11 Which Semantic Web? Version 1: “Semantic Web as Web of Data”. Version 2: “Enrichment of the current Web”. Different use-cases, different techniques, different users.

Four popular fallacies about the Semantic Web (Section: popular fallacies)

13 First: clear up some popular misunderstandings. False statement No. 1: “Semantic Web people try to enforce meaning from the top.” They only “enforce” a language. They don’t enforce what is said in that language. Compare: HTML was “enforced” from the top, but content is entirely free.

14 First: clear up some popular misunderstandings. False statement No. 2: “The Semantic Web people will require everybody to subscribe to a single predefined "meaning" for the terms we use.” Of course, meaning is fluid, contextual, etc. Lots of work on (semi-)automatically bridging between different vocabularies.

15 First: clear up some popular misunderstandings. False statement No. 3: “The Semantic Web will require users to understand the complicated details of formalised knowledge representation.” All of this is “under the hood”.

16 First: clear up some popular misunderstandings. False statement No. 4: “The Semantic Web people will require us to manually markup all the existing web-pages.” Lots of work on automatically producing semantic markup: named-entity recognition, concept extraction, etc.

The current state of Semantic Web (Section: current status)

18 4 hard questions on the Semantic Web. Q1: “Where does the meta-data come from?” NL technology is delivering on concept extraction; socially emerging (learning from tagging). Q2: “Where do the meta-data schemas come from?” Many handcrafted schemas; hierarchy learning remains hard; relation extraction remains hard. Q3: “What to do with many meta-data schemas?” Ontology mapping/aligning remains VERY hard. Q4: “Where’s the ‘Web’ in the Semantic Web?” More attention to social aspects (P2P, FOAF); non-textual media remains hard; deal with typical Web requirements.

19 Q1: Where do the ontologies come from? Professional bodies, scientific communities, companies, publishers, … Good old-fashioned Knowledge Engineering. Convert from DB-schema, UML, etc. Learning remains very hard…

20 Q1: Where do the ontologies come from? Handcrafted: music: CDnow (2410/5), MusicMoz (1073/7). Community efforts: biomedical: SNOMED (200k), GO (15k). Commercial: Emtree (45k+190k). Ranging from lightweight (Yahoo) to heavyweight (Cyc); ranging from small (METAR) to large (UNSPSC).

21 Q2: Where do the annotations come from? Automated learning; shallow natural language analysis; concept extraction. Example: Encyclopedia Britannica on “Amsterdam”. [Figure: extracted concepts: amsterdam, trade, antwerp, europe, merchant, city, town, center, netherlands.]
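
As a rough illustration of shallow concept extraction, the sketch below simply counts non-stopword terms. This is a deliberately naive stand-in, not the technique used in the talk's example; the stopword list and the function name are assumptions.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "and", "in", "is", "was", "to", "its", "as", "for", "it"}


def extract_concepts(text, top_n=5):
    """Return the top_n most frequent content words as candidate concepts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]
```

Applied to an encyclopedia-style paragraph about a city, such a function would surface terms of the same kind as those in the slide (city, merchant, trade), which can then be attached to the page as annotations.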

22 Q2: Where do the annotations come from? Lightweight NLP: Dutch language semantic search engine. Exploit existing legacy-data: Amazon, lab equipment. Side-effect from user interaction: MIT Lab photo-annotator. NOT from manual effort.

23 Q3: What to do with many ontologies? MeSH: Medical Subject Headings, National Library of Medicine; 22,000 descriptions. EMTREE: commercial (Elsevier), drugs and diseases; 45,000 terms, 190,000 synonyms. UMLS: integrates 100 different vocabularies. SNOMED: 200,000 concepts, College of American Pathologists. Gene Ontology: 15,000 terms in molecular biology. NCI Cancer Ontology: 17,000 classes (about 1M definitions).

24 Q3: What to do with many ontologies? Stitching all this together by hand?

25 Q3: What to do with many ontologies? Four families of matching approaches: linguistics & structure; shared vocabulary; instance-based matching; shared background knowledge.
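
Two of the approaches listed above can be sketched in a few lines. The code assumes each ontology is given as a mapping from concept label to the set of instances (for example document IDs) annotated with that concept; that representation, the function names and the 0.5 threshold are assumptions for illustration. The linguistic measure compares labels as strings, and the instance-based measure uses Jaccard overlap of the instance sets.

```python
from difflib import SequenceMatcher


def label_similarity(label_a, label_b):
    """Linguistic matching: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, label_a.lower(), label_b.lower()).ratio()


def instance_overlap(instances_a, instances_b):
    """Instance-based matching: Jaccard overlap of two instance sets."""
    a, b = set(instances_a), set(instances_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_by_instances(onto_a, onto_b, threshold=0.5):
    """Map each concept in onto_a to its best-overlapping concept in onto_b.

    Concepts whose best overlap is below the threshold stay unmapped.
    """
    mapping = {}
    for concept_a, inst_a in onto_a.items():
        best, best_score = None, 0.0
        for concept_b, inst_b in onto_b.items():
            score = instance_overlap(inst_a, inst_b)
            if score > best_score:
                best, best_score = concept_b, score
        if best is not None and best_score >= threshold:
            mapping[concept_a] = best
    return mapping
```

The instance-based measure can align concepts whose labels share nothing (say, "Myocardial infarction" and "Heart attack") as long as the same items were classified under both, which is where purely linguistic matching fails.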

26 Where are we now: tools. Languages are stable. Tooling is rapidly emerging: HP, IBM, Oracle, Adobe, …; parsers, editors, visualisers, large-scale storage and querying, portal generation, search.

27 Where are we now: applications. Healthy uptake in some areas: knowledge management / intranets, data-integration, life-sciences, convergence with Semantic Grid, cultural heritage. Still very few applications in personalisation and mobility/context awareness. Most applications are for companies, few for the public.

Future directions/challenges (Section: future challenges)

29 Semantic Web as an integrator of many different subfields: Databases, Natural Language Processing, Knowledge Representation, Machine Learning, Information Retrieval, Agents, HCI, …

30 Provocation… Ontology research is done… We know how to make, maintain & deploy them. We have tools & methods for editing, storing, inferencing, visualising, etc. … except for two problems: learning and mapping. Natural language technology is also done… at least it’s good enough.

31 Large open questions. Ontology learning & mapping. Emerging semantics (social & statistical). Semantic Web services: discovery, composition: realistic? Non-textual media: the semantic gap: text or social? Deployment: 1. data-integration, 2. search, 3. personalisation.

32 Changing focus: from centralised, formalised, complete, precise to distributed, heterogeneous, open, P2P, approximate, lightweight. Web 3.0 = Web 2.0 + Semantic Web.

33 Predicting the future… [Figure: two-axis chart, Web (not much to lots) against Semantics (not much to lots), contrasting Artificial Intelligence with Collective Intelligence. Labels: RDF, flexible & extensible metadata schemas, Semantic Web Services, ontology building, OWL, knowledge discovery, SWRL, decision making, FOAF, RSS, social bookmarking, NLP, information linking. Slide by Carole Goble.]
