Bluffers Guide to The Semantic Web
Frank van Harmelen, CS Department, Vrije Universiteit Amsterdam
Data wants to be free.


1 Bluffers Guide to The Semantic Web. Frank van Harmelen, CS Department, Vrije Universiteit Amsterdam. Data wants to be free.

2 Semantics as your saviour?

4 Outline
- The general idea: a Web of Data
- What must be done to realise this
- How far away is this
- Next steps, do’s, don’ts

5 The Scientist’s (Everybody’s) Problem
Too much unintegrated data:
- from a variety of incompatible sources
- no standard naming convention
- each with a custom browsing and querying mechanism (no common interface)
- and poor interaction with other data sources

6 What are the Data Sources?
- Flat files
- URLs
- Proprietary databases
- Public databases
- Spreadsheets
- Emails
- Maps
- …
Data wants to be free

7 In which disciplines?
- Archeology
- Chemistry
- Genomics, proteomics, ... (bio/life-sciences)
- Communication science
- Social history
- Linguistics
- Bio-diversity
- Environmental sciences (climate studies)
- Libraries (KB), archives (sound & vision)
- Geo?
Callouts on the slide: one dataset per site; a new database each month; historical data; laymen data; international data (for their first time)

8 Outline
- The general idea: a Web of Data
- What must be done to realise this
- How far away is this
- Next steps, do’s, don’ts

9 The Current Web of text and pictures
Linked web pages, written by people, written for people, used only by people...
Many of these pages already come from data that is usable by computers! But we can’t link the data...
The Future Web of Data
Linked data, usable by computers! Useful for people!
Data wants to be free

10 Which Semantic Web? Version 1: “Enrichment of the current Web”
Recipe:
- annotate and classify web content
- enable better search & browse, ...

11 Which Semantic Web? Version 2: "Semantic Web as Web of Data" (TBL)
Recipe: expose databases on the web, use RDF, integrate
Meta-data from:
- expressing DB schema semantics in machine interpretable ways
Enable integration and unexpected re-use

12 Outline
- The general idea: a Web of Data
- What must be done to realise this
- How far away is this
- Next steps, do’s, don’ts

13 Machine accessible meaning (What it’s like to be a machine)
Diagram labels: symptoms, drug administration, disease; relations: IS-A, alleviates; caption: META-DATA

14 What is meta-data?
- it's just data
- it's data describing other data
- it's meant for machine consumption
Diagram labels: disease, name, symptoms, drug administration

15 Required are:
1. a standard syntax
   - so meta-data can be recognised as such
2. one or more shared vocabularies
   - so data producers and data consumers all speak the same language
3. lots of resources with meta-data attached
Also: mechanisms for attribution and trust

16 1. A standard syntax
Things & relations between things
Semantic Web data model: RDF
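
The "things & relations between things" model can be sketched in a few lines of Python. This is only an illustration of the triple idea, not an RDF library: a graph is a set of (subject, predicate, object) triples, and a query is a pattern with wildcards. All `ex:` names are hypothetical and only loosely echo the drug/symptom labels on the earlier slides.

```python
# A triple is (subject, predicate, object); a graph is a set of triples.
# The ex: names are made up for illustration, not taken from a real vocabulary.
graph = {
    ("ex:aspirin", "ex:alleviates", "ex:headache"),
    ("ex:aspirin", "rdf:type", "ex:Drug"),
    ("ex:headache", "rdf:type", "ex:Symptom"),
    ("ex:migraine", "ex:hasSymptom", "ex:headache"),
}


def match(graph, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which things alleviate a headache?
alleviators = [s for s, _, _ in match(graph, p="ex:alleviates", o="ex:headache")]
```

Because every statement has the same three-part shape, adding a new kind of relation needs no schema change: it is just one more triple in the set.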

17 RDF Triples in Life Sciences

18 RDF Triples in Geo
Diagram: a geo:point node (written geo:point:_ on the slide) with geo:lat 55.701 and geo:long 12.552
Remember: RDF = simple model for data
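
One possible reading of the geo diagram, sketched in Python: an unnamed point carries a latitude and a longitude, and some other resource refers to that point. Only geo:lat, geo:long and the two coordinates come from the slide; the blank-node label `_:p1`, the resource `ex:office` and the property `ex:location` are assumptions added for illustration.

```python
# "_:p1" stands for the unnamed point (an assumption about the diagram).
# "ex:office" and "ex:location" are hypothetical names.
triples = [
    ("ex:office", "ex:location", "_:p1"),
    ("_:p1", "geo:lat", 55.701),
    ("_:p1", "geo:long", 12.552),
]


def coordinates(triples, place):
    """Follow ex:location from a place to its point, then read lat/long."""
    point = next(o for s, p, o in triples if s == place and p == "ex:location")
    props = {p: o for s, p, o in triples if s == point}
    return props["geo:lat"], props["geo:long"]


lat, long_ = coordinates(triples, "ex:office")
```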

19 RDF Schema: vocabulary for data types
- Classes + subclass hierarchy: rivers are waterways
- Properties + subproperty hierarchy: father-of implies parent-of
- Domain of properties: X capital-of Y ⇒ X has-type city
- Range of properties: X capital-of Y ⇒ Y has-type country
Simple standardised inferences
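
The four inference patterns on this slide can be sketched as a naive forward-chaining loop. This is a toy under stated assumptions, not a complete RDFS reasoner: it covers only the subclass, subproperty, domain and range rules, and the `ex:` names and individuals are invented for the example.

```python
def rdfs_closure(triples):
    """Apply four RDFS-style rules repeatedly until no new triple appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # subclass: x type C, C subClassOf D  =>  x type D
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdf:type", o2))
                # subproperty: x P y, P subPropertyOf Q  =>  x Q y
                if p2 == "rdfs:subPropertyOf" and p == s2:
                    new.add((s, o2, o))
                # domain: x P y, P domain C  =>  x type C
                if p2 == "rdfs:domain" and p == s2:
                    new.add((s, "rdf:type", o2))
                # range: x P y, P range C  =>  y type C
                if p2 == "rdfs:range" and p == s2:
                    new.add((o, "rdf:type", o2))
        if new <= facts:
            return facts
        facts |= new


# Schema statements mirroring the slide's examples (ex: names are hypothetical).
schema = [
    ("ex:River", "rdfs:subClassOf", "ex:Waterway"),
    ("ex:fatherOf", "rdfs:subPropertyOf", "ex:parentOf"),
    ("ex:capitalOf", "rdfs:domain", "ex:City"),
    ("ex:capitalOf", "rdfs:range", "ex:Country"),
]
data = [
    ("ex:Rhine", "rdf:type", "ex:River"),
    ("ex:jan", "ex:fatherOf", "ex:piet"),
    ("ex:Amsterdam", "ex:capitalOf", "ex:Netherlands"),
]

inferred = rdfs_closure(schema + data)
```

After the loop, `inferred` also contains the derived statements, for example that `ex:Rhine` is an `ex:Waterway` and that `ex:Amsterdam` is an `ex:City`, although neither was stated in `data`.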

20 OWL: richer vocabulary for data types
Things RDF Schema cannot express (Description Logic SHOIN(D)):
- equality, disjunction, negation
- min/max number restrictions
- inverse, symmetric, transitive properties
- and much more…
Example: every country has precisely one capital
- Inference: TheHague ≠ A’dam & A’dam = capital ⇒ TheHague ≠ capital
- Integrity checks after data-merging
Complex standardised inferences
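
The "integrity check after data-merging" idea can be sketched in Python for the single axiom on the slide. This hard-codes one rule and is in no way an OWL reasoner; the function name, the `ex:` identifiers and the two sources are hypothetical. Note the explicit `different` set: OWL does not assume that two names denote two things, so without a statement that the cities differ, a reasoner would conclude they are the same city rather than report a conflict.

```python
def capital_conflicts(has_capital, different):
    """Report countries given two capitals that are known to be distinct.

    Under the axiom "every country has precisely one capital", two capital
    values for one country must denote the same thing; if they are also
    asserted to be different, the merged data is inconsistent.
    """
    by_country = {}
    for country, city in has_capital:
        by_country.setdefault(country, set()).add(city)

    conflicts = []
    for country, cities in sorted(by_country.items()):
        for a in sorted(cities):
            for b in sorted(cities):
                if a < b and frozenset((a, b)) in different:
                    conflicts.append((country, a, b))
    return conflicts


# Two hypothetical sources that disagree about the capital.
source_a = [("ex:Netherlands", "ex:Amsterdam")]
source_b = [("ex:Netherlands", "ex:TheHague")]
# Explicit statement that the two cities are not the same thing.
different = {frozenset(("ex:Amsterdam", "ex:TheHague"))}

conflicts = capital_conflicts(source_a + source_b, different)
```

Each source is consistent on its own; the conflict only shows up once the two are merged, which is the point of running the check after integration.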

21 Web of Data: anybody can say anything about anything
All identifiers are URLs (= on the Web)
- Allows total decoupling of data, vocabulary and meta-data
- Different owners & locations
Diagram: x [IsOfType] T
Data wants to be free

22 2. Shared vocabularies
- MeSH: Medical Subject Headings, National Library of Medicine; 22,000 descriptions
- EMTREE: commercial (Elsevier), drugs and diseases; 45,000 terms, 190,000 synonyms
- UMLS: integrates 100 different vocabularies
- SNOMED: 200,000 concepts, College of American Pathologists
- Gene Ontology: 15,000 terms in molecular biology
- NCBI Cancer Ontology: 17,000 classes (about 1M definitions)
Callouts on the slide: BioMed; Geo?

23 Outline
- The general idea: a Web of Data
- What must be done to realise this
- How far away is this
- Next steps, do’s, don’ts

24 How far away is this?
- Stable data formats & standardised inferences
- Lots of shared vocabularies (+ ways to convert them)
- Lots of data sources (+ ways to convert them)
- Lots of tools
  - convert, construct, edit (data, vocabularies)
  - store, search, query, reason
  - interlink
  - visualise
  - ...

25 How far away is this? Not very far away!
Already many billions of facts & rules; rapidly growing Linked Open Data cloud:
- Encyclopedia
- Geographic names (millions)
- Names of artists & art works (10,000s)
- Scientific bibliographies
- Hierarchical dictionaries (UK, FR, NL)
- Life-science databases
- Any CD ever recorded
- (Almost) every book sold by Amazon
- Basic facts on every country on the planet
- Common sense rules & facts (100,000s)
It gets bigger every month

26 Example use-case: bbc.co.uk/music/artists
- Content is BBC + LOD
- Use an ontology as basis for the site
- Serve data back out as RDF
“The Web is becoming our content management platform”

27 Outline
- The general idea: a Web of Data
- What must be done to realise this
- How far away is this
- Next steps, do’s, don’ts

28 Next steps
1. hunt for shared vocabularies
   - try to avoid building them
2. wrap legacy data sources
   - your own
   - from others
3. link wrapped sources
4. publish linked data on the web
   - make noise
5. reconstruct some old results
6. produce new results
7. get famous
Can you get famous by sharing data?
- papers in oncology, in communication science
- dedicated conferences in chemistry, earth-sciences, life-sciences, humanities
- funding opportunities in humanities, social sciences, life sciences
- learn / get access to some basic technology
- in-use systems in communication science, KB, Beeld & Geluid, Europeana
A little semantics goes a long way

29 Questions & discussion
Frank.van.Harmelen@cs.vu.nl
http://www.cs.vu.nl/~frankh/popularising.html

