Frank van Harmelen, Vrije Universiteit Amsterdam: The Information Universe of the (Near) Future

1 Frank van Harmelen, Vrije Universiteit Amsterdam. The Information Universe of the (Near) Future. Creative Commons License: allowed to share & remix, but must attribute & non-commercial.

2 What it will look like. Why it needs infinite scalability, and how to achieve this with the Large Knowledge Collider.

3 The Current Information Universe: linked web-pages, written by people, written for people, used only by people... Many of these pages already come from data, usable by computers! But we can't link the data... The Future Information Universe: linked data, usable by computers! Useful for people!

4 How far away is this? Not very far away! Already many billions of facts & rules in the rapidly growing Linked Open Data cloud: Encyclopedia; geographic names (millions); names of artists & art works (10,000s); scientific bibliographies; hierarchical dictionaries (UK, FR, NL); life-science databases; any CD ever recorded; (almost) every book sold by Amazon; basic facts on every country on the planet; common sense rules & facts (100,000s). It gets bigger every month.

5 Full Web-style decoupling: re-usability, independence. All identifiers are URLs (= on the Web) –Allows total decoupling of data, vocabulary, and meta-data (diagram: x, T, [IsOfType]), each with different owners & locations.
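
A minimal sketch of the decoupling on this slide, using plain Python tuples rather than a real RDF library. It assumes that the diagram's x is a data item, T a vocabulary term, and IsOfType the meta-data relation linking them. All URLs are hypothetical placeholders, not identifiers from the talk; the point is only that each part of one fact can be published by a different owner.

```python
from urllib.parse import urlparse

# Hypothetical identifiers: each one is a URL owned by a different party.
X = "http://data.example.org/item/x"              # the data item "x"
T = "http://vocab.example.net/type/T"             # the vocabulary term "T"
IS_OF_TYPE = "http://meta.example.com/IsOfType"   # the meta-data relation

# One fact, written as a (subject, predicate, object) triple.
triple = (X, IS_OF_TYPE, T)

def owners(t):
    """Return the host name that publishes each part of the triple."""
    return tuple(urlparse(part).netloc for part in t)

print(owners(triple))
```

Because every part is a URL, any of the three parties can change or extend its own part without coordinating with the other two.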

6 For the first time ever, it is now possible to re-use somebody else's knowledge base: without having to talk to them first (syntax, semantics); without having to make copies. Rapid growth: "billion triple challenge" (= machine-reason with a billion facts and rules). 2006: where do we get a billion facts from? 2008: which billion shall we choose!

7 What to do when success is becoming a problem? The Large Knowledge Collider: a platform for infinitely scalable reasoning on the data-web.

8 Infinite scalability? Parallelisation: cluster computing. Distribution: Thinking@home, self-computing semantic Web. Approximation: almost is often good enough; gets better with more resources.

9 First result: MaRVIN. MaRVIN scales by: distribution (over many nodes); approximation (sound but incomplete); anytime convergence (more complete over time). "Brain the size of a planet".
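
The terms "sound but incomplete" and "anytime convergence" can be illustrated with a toy example. The sketch below is not MaRVIN's algorithm; it is a hypothetical single-machine loop, assuming a single transitivity rule, that stops after a fixed budget of derivation attempts. It does not show distribution over nodes. The function name, the `budget` parameter, and the facts are all invented for illustration.

```python
def anytime_closure(facts, budget):
    """Derive transitive facts, stopping after `budget` derivation attempts.

    Sound: every returned pair follows from `facts` by transitivity.
    Incomplete: with a small budget some derivable pairs are missing.
    """
    known = set(facts)
    steps = 0
    changed = True
    while changed:
        changed = False
        # sorted() takes a snapshot, so adding to `known` inside is safe
        for (a, b) in sorted(known):
            for (c, d) in sorted(known):
                if steps >= budget:
                    return known      # out of resources: return what we have
                steps += 1
                if b == c and (a, d) not in known:
                    known.add((a, d))
                    changed = True
    return known

# Hypothetical facts: a chain a -> b -> c -> d.
facts = [("a", "b"), ("b", "c"), ("c", "d")]

partial = anytime_closure(facts, budget=2)
complete = anytime_closure(facts, budget=10**6)
print(len(partial), len(complete))
```

A caller can interrupt at any budget and still get only correct conclusions; giving the loop more resources only ever adds answers.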

10 The consortium 14 partners, 50 people

11 The project 10M budget 3.5 years 80 person years 3 case studies 14 partners

12 Use case: Drug Discovery. Problem: pharmaceutical R&D in early clinical development is stagnating. FDA white paper "Innovation or Stagnation" (March 2004): "developers have no choice but to use the tools of the last century to assess this century's candidate solutions"; "industry scientists often lack cross-cutting information about an entire product area, or information about techniques that may be used in areas other than theirs". Example query: Show me any potential liver toxicity associated with the compound's drug class, target, structure and disease. Genetics: Show me all liver toxicity associated with the target or the pathway. Chemistry: Show me all liver toxicity associated with compounds with similar structure. Literature: Show me all liver toxicity from the public literature and internal reports that are related to the drug class, disease and patient population. Current NCBI: linking but no inference.
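
The contrast between "linking" and "inference" can be sketched in a few lines. This is an illustration only, with hypothetical placeholder records and function names; none of the data comes from NCBI or from the talk. Linking follows one stored link, while inference chains several links to answer a question that no single record answers.

```python
# Hypothetical placeholder records; none of these come from a real database.
targets = {"compound_A": "target_T"}       # compound -> its target
pathway_of = {"target_T": "pathway_P"}     # target -> its pathway
toxicity_reports = {"pathway_P": ["liver toxicity report 1"]}

def linked(compound):
    """Linking only: follow one stored link from the compound."""
    return targets.get(compound)

def inferred_toxicity(compound):
    """Inference: chain compound -> target -> pathway -> reports."""
    target = targets.get(compound)
    pathway = pathway_of.get(target)
    return toxicity_reports.get(pathway, [])

print(linked("compound_A"))
print(inferred_toxicity("compound_A"))
```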

13 Use Case: City on-line. Our cities face many challenges; Urban Computing is the ICT way to address them. Where is the traffic moving? Is public transportation where the people are? Is public transportation where people will be? Which location attracts most people right now? Which landmarks attract more people? Where are people concentrating? Goal: improve the quality of life.

14 Is anybody doing this for real? OpenCalais: –enrich text (news items) with semantic meta-data –recognise people, places, events, organisations,... –useful for searching, selecting, personalising, aggregating, summarising, etc. From early 2009: –identify people, places, events, organisations,... by linking to the Open Data cloud:

15 Summarising: The Information Universe of the Future will be a Web of Data. This Web of Data is rapidly taking shape. There are compelling use-cases. Industrial take-up is beginning to happen. We are building new infrastructure to deal with the required scale.

16 Contact Info. Want to ask questions? Want to play with LarKC? Want to contribute plugins? Want to run a use-case?
