The Semantic Web from ft
Frank van Harmelen
Creative Commons License: allowed to share & remix, but must attribute & non-commercial
The Semantic Web = a big engineering effort + a set of information structuring principles
People Web Machines HOW?
The Current Web of text and pictures
– linked web pages, written by people, written for people, used only by people...
– Many of these pages already come from data that is usable by computers! But we can’t link the data...
The Future Web of Data
– linked data, usable by computers! useful for people!
P1. Give all things a name
P2. Relations form a graph between things
P3. The names are addresses on the Web
[Figure: x, T and the relation [ IsOfType ], at different owners & locations]
P1+P2+P3 = Giant Global Graph
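A minimal sketch of how P1, P2 and P3 combine, assuming plain Python tuples as triples. All URIs and names below (example.org, example.com, ALICE, locatedIn, worksIn) are hypothetical placeholders, not taken from the slides. The point it illustrates: two owners publish facts separately, and because they use the same web address for the same thing, merging their data is just a set union.

```python
# Hypothetical names: every URI below is an illustrative placeholder.

# P1 + P3: each thing gets a name, and the name is an address on the Web.
AMSTERDAM = "http://example.org/city/Amsterdam"
NETHERLANDS = "http://example.org/country/Netherlands"
ALICE = "http://example.com/people/alice"

LOCATED_IN = "http://example.org/rel/locatedIn"
WORKS_IN = "http://example.com/rel/worksIn"

# P2: relations are (subject, predicate, object) triples, i.e. labelled edges.
source_a = {(AMSTERDAM, LOCATED_IN, NETHERLANDS)}  # published by one owner
source_b = {(ALICE, WORKS_IN, AMSTERDAM)}          # published by another owner

# Both sources use the same name for Amsterdam, so a set union links them.
graph = source_a | source_b


def reachable(triples, start):
    """Return every node that can be reached from start by following edges."""
    seen, todo = set(), [start]
    while todo:
        node = todo.pop()
        for subject, _predicate, obj in triples:
            if subject == node and obj not in seen:
                seen.add(obj)
                todo.append(obj)
    return seen


# In source_b alone, alice only reaches Amsterdam; in the merged graph the
# path continues into data that a different owner published.
print(reachable(source_b, ALICE))
print(reachable(graph, ALICE))
```

Using a set of tuples keeps the sketch dependency-free; a real deployment would more likely use an RDF library and a triple store.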
P4. explicit & formal semantics
– assign types to things
– assign types to relations
– organise types in a hierarchy
– impose constraints on possible interpretations
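A rough sketch of what "organise types in a hierarchy" buys a machine, assuming the same triple representation as plain Python tuples. The relation name SubTypeOf and the example types (City, Place, Thing) are assumptions made for illustration; only IsOfType appears in the slides. Once the hierarchy is explicit, a thing typed once can be recognised under every more general type.

```python
IS_OF_TYPE = "IsOfType"    # relation named on the P3 slide
SUBTYPE_OF = "SubTypeOf"   # hypothetical name for the hierarchy relation

# Hypothetical example facts.
facts = {
    ("Amsterdam", IS_OF_TYPE, "City"),
    ("City", SUBTYPE_OF, "Place"),
    ("Place", SUBTYPE_OF, "Thing"),
}


def supertypes(triples, type_name):
    """Return all types above type_name in the hierarchy."""
    found, todo = set(), [type_name]
    while todo:
        current = todo.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == SUBTYPE_OF and obj not in found:
                found.add(obj)
                todo.append(obj)
    return found


def types_of(triples, thing):
    """Return the stated types of thing plus everything the hierarchy implies."""
    direct = {o for s, p, o in triples if s == thing and p == IS_OF_TYPE}
    inferred = set(direct)
    for type_name in direct:
        inferred |= supertypes(triples, type_name)
    return inferred


print(types_of(facts, "Amsterdam"))
```

Only one type was stated for Amsterdam; the other two follow from the hierarchy, which is the sense in which the semantics is "explicit & formal".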
What’s it like to be a computer on the web?
Examples of “semantics”
– married-to relates males to females
– married-to relates 1 male to 1 female
[Figure labels: is male; =; lower bound, upper bound; married-to]
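The two example statements about married-to can be read as rules a machine can apply. The sketch below is one possible reading, under stated assumptions: the first statement is treated as a type constraint on both ends of the relation, and the second as an upper bound of 1, so that two partners recorded for the same individual must denote the same thing. The individuals a, b and c and the dictionary names are hypothetical.

```python
MARRIED_TO = "married-to"

# Assumed encoding of the slide's statements (names are illustrative).
DOMAIN = {MARRIED_TO: "Male"}        # subjects of married-to are male
RANGE = {MARRIED_TO: "Female"}       # objects of married-to are female
MAX_CARDINALITY = {MARRIED_TO: 1}    # upper bound: at most 1 partner

# Hypothetical facts, possibly collected from two different sources.
facts = {
    ("a", MARRIED_TO, "b"),
    ("a", MARRIED_TO, "c"),
}


def infer_types(triples):
    """Derive (thing, type) pairs from the domain and range of each relation."""
    types = set()
    for subject, predicate, obj in triples:
        if predicate in DOMAIN:
            types.add((subject, DOMAIN[predicate]))
        if predicate in RANGE:
            types.add((obj, RANGE[predicate]))
    return types


def infer_same(triples):
    """With an upper bound of 1, distinct names for the partner must co-refer."""
    partners = {}
    for subject, predicate, obj in triples:
        if MAX_CARDINALITY.get(predicate) == 1:
            partners.setdefault((subject, predicate), set()).add(obj)
    same = set()
    for names in partners.values():
        ordered = sorted(names)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                same.add((first, second))
    return same


print(infer_types(facts))
print(infer_same(facts))
```

Note the design choice: the cardinality rule is used to conclude that b and c are the same individual rather than to report an error, which is how such constraints restrict the possible interpretations of the data instead of validating it.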
Did you get anywhere? (1/2)
already many billions of facts & rules
– Encyclopedia
– Geographic names (millions)
– names of artists & art works (10.000’s)
– scientific bibliographies
– hierarchical dictionaries (UK, FR, NL)
– life-science databases
– any CD ever recorded (almost)
– basic facts on every country on the planet
– common sense rules & facts ( ’s)
May ‘09 estimate: > 4.2 billion triples, million interlinks
It gets bigger every month: 25 billion facts & relations…
Did you get anywhere? (2/2)
Real life examples
– handcrafted
– music: CDnow (2410/5), MusicMoz (1073/7)
– biomedical: SNOMED (200k), GO (15k), Emtree (45k+190k), Systems biology
– ranging from lightweight (Yahoo, UNSPSC, Open directory (400k)) to heavyweight (Cyc (300k))
– ranging from small (METAR) to large (UNSPSC)
Did you get anywhere? (3/2)
used by
– media: BBC, Reuters, New York Times
– governments
– retail (10K companies): BestBuy, Sears, Kmart, Volkswagen, Renault
– IT: IBM, Oracle
– Search: Bing, Yahoo, Google (before May 2012, after May 2012)
Any lessons?
heterogeneity is unavoidable
Much heterogeneity is solved socially
knowledge obeys a long-tail distribution
Types
Semantic Web from ft?
– It’s not yet very well understood
– It’s a surprisingly successful engineering effort
– It’s a handful of principles