(1) Ivan Herman, W3C. W3C Brazil Office Meeting, São Paulo, Brazil, 2010-10-15
(21) You had to consult a large number of sites, all different in style, purpose, and possibly language… You had to mentally integrate all that information to achieve your goals. We all know that, sometimes, this is a long and tedious process!
(22) All those pages are only the tips of their respective icebergs:
◦ the real data is hidden in databases, XML files, Excel sheets, …
◦ you only have access to what the Web page designers allow you to see
(23) Specialized sites (Expedia, TripAdvisor) do a bit more:
◦ they gather and combine data from other sources (usually with the approval of the data owners)
◦ but they still control how you see those sources
But sometimes you want to personalize: access the original data and combine it yourself!
(28) I have to type the same data again and again…
This is even worse: I feed the icebergs…
(29) The raw data should be available on the Web
◦ let the community figure out what applications are possible…
(33) Mashup sites are forced to do very ad hoc jobs:
◦ various data sources expose their data via Web Services and APIs
◦ each with a different API, a different logic, a different structure
◦ mashup sites are forced to reinvent the wheel many times, because there is no standard way of getting to the data!
(34) The raw data should be available in a standard way on the Web:
◦ i.e., using URIs to access data
◦ dereferencing those URIs should lead to something useful (see the sketch below)
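To make the dereferencing idea concrete, here is a minimal sketch in Python, assuming the requests library and using a public DBpedia URI as an illustrative Linked Data source; the URI and the returned format are examples, not part of the talk:

```python
# Sketch: dereference a Linked Data URI with HTTP content negotiation.
# We ask for RDF (Turtle) instead of the HTML page a browser would get.
# The DBpedia URI below is only an illustrative public example.
import requests

uri = "http://dbpedia.org/resource/Amsterdam"
response = requests.get(
    uri,
    headers={"Accept": "text/turtle"},  # "please send machine-readable data"
    allow_redirects=True,  # Linked Data servers often 303-redirect to the data
)

print(response.headers.get("Content-Type"))  # e.g., text/turtle
print(response.text[:300])                   # the first few triples
```

The same URI, requested with "Accept: text/html", would serve a human-readable page: one identifier, two representations.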
(35) What makes the current (document) Web work?
◦ people create different documents
◦ they give each one an address (i.e., a URI) and make it accessible to others on the Web
(37) Others discover the site and link to it. The more they link to it, the more important and well-known the page becomes
◦ remember, this is what, e.g., Google exploits!
This is the “network effect”: some pages become important, and others begin to rely on them, even if the author did not expect it…
(40) The same network effect works on raw data:
◦ many people link to the data and use it
◦ many more (and more diverse) applications will be created than the “authors” would ever dream of!
(43) Photo credit “nepatterson”, Flickr
(44) Imagine a “Web” where:
◦ documents are available for download on the Internet
◦ but there are no hyperlinks among them
This is certainly not what we want!
(50) The raw data should be available in a standard way on the Web
There should be links among datasets
(51) Photo credit “kxlly”, Flickr
(52) On the traditional Web, humans are implicitly taken into account. A Web link has a “context” that a person may use.
(55) A human understands that this is where my office is, i.e., the institution’s home page. He/she knows what it means
◦ realizes that it is a research institute in Amsterdam
When handling data, something is missing: machines can’t make sense of the link alone
(56) New lesson learned:
◦ extra information (a “label”) must be added to a link: “this links to my institution, which is a research institute”
◦ this information should be machine-readable
◦ this is a characterization (or “classification”) of both the link and its target
◦ in some cases, the classification should allow for some limited “reasoning”
(a sketch of such a labeled link follows below)
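As an illustration of a labeled, machine-readable link, here is a small sketch using the rdflib Python library; the ex: vocabulary and all URIs are invented for the example and are not taken from the talk:

```python
# Sketch of a "labeled link": the link itself is typed (ex:worksFor) and
# its target is classified (ex:ResearchInstitute), so software can act on it.
# The ex: vocabulary and all URIs below are purely illustrative.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

turtle_data = """
@prefix ex: <http://example.org/vocab#> .

<http://example.org/people/ivan>
    ex:worksFor <http://research-institute.example.org/> .

<http://research-institute.example.org/>
    a ex:ResearchInstitute ;
    ex:locatedIn "Amsterdam" .
"""

g = Graph()
g.parse(data=turtle_data, format="turtle")

# A machine can now answer: "which link targets are research institutes?"
EX = Namespace("http://example.org/vocab#")
for institute in g.subjects(RDF.type, EX.ResearchInstitute):
    print(institute)
```

This is exactly the extra machine-readable information the slide asks for: the link is no longer a bare pointer but a typed, classifiable statement.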
(57) The raw data should be available in a standard way on the Web
Datasets should be linked
Links, data, and sites should be characterized, classified, etc.
The result is a Web of Data
(60) It is that simple… Of course, the devil is in the details:
◦ a common data model has to be provided
◦ the “classification” of the terms can become very complex
◦ but these details are being fleshed out by experts as we speak!
(61) A set of core technologies is in place
Lots of data (billions of relationships) are available in a standard format
◦ often referred to as the “Linked Open Data Cloud” (see the query sketch below)
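To give a feel for what querying that cloud looks like, here is a hedged sketch of a SPARQL query sent to DBpedia's public endpoint, one hub of the Linked Open Data Cloud; the endpoint URL and the DBpedia ontology properties are illustrative and may evolve over time:

```python
# Sketch: query the Linked Open Data cloud with SPARQL over HTTP.
# DBpedia's public endpoint serves as an illustrative hub; the dbo:
# properties are taken from the DBpedia ontology and may change.
import requests

query = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?city ?population WHERE {
  ?city a dbo:City ;
        dbo:country dbr:Brazil ;
        dbo:populationTotal ?population .
}
ORDER BY DESC(?population)
LIMIT 5
"""

response = requests.get(
    "https://dbpedia.org/sparql",
    params={"query": query, "format": "application/sparql-results+json"},
)

# Standard SPARQL JSON results: {"results": {"bindings": [...]}}
for row in response.json()["results"]["bindings"]:
    print(row["city"]["value"], row["population"]["value"])
```

The point is not this particular query but the idiom: one standard data model (RDF), one standard query language (SPARQL), any dataset in the cloud.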
(63) There is a vibrant community of:
◦ academics: the universities of Southampton, Oxford, Stanford, PUC
◦ small startups: Garlik, Talis, C&P, TopQuadrant, Cambridge Semantics, OpenLink, …
◦ major companies: Oracle, IBM, SAP, …
◦ users of Semantic Web data: Google, Facebook, Yahoo!
◦ publishers of Semantic Web data: the New York Times, the US Library of Congress, open governmental data (US, UK, France, …)
(64) Companies and institutions are beginning to use the technology:
◦ BBC, Vodafone, Siemens, NASA, BestBuy, Tesco, the Korean National Archives, Pfizer, Chevron, … (see http://www.w3.org/2001/sw/UseCases)
Truth be told, we still have a way to go
◦ deployment may still be experimental, or confined to some specific places
(69) Helps find the best drug regimen for a specific case, per patient
Integrates data from various sources (patients, physicians, pharma companies, researchers, ontologies, etc.)
Data (e.g., regulations, drugs) change often, but the tool is much more resistant to such change
Courtesy of Erick Von Schweber, PharmaSURVEYOR Inc. (SWEO Use Case)
(70) Integration of relevant data in Zaragoza
Uses rules to provide a proper itinerary
Courtesy of Jesús Fernández, Municipality of Zaragoza, and Antonio Campos, CTIC (SWEO Use Case)
(71) More and more data should be “published” on the Web
◦ this can lead to the “network effect” on data
New breeds of applications come to the fore:
◦ “mashups on steroids”
◦ better representation and usage of community knowledge
◦ new customization possibilities
◦ …
(72) A huge amount of data (“information”) is available on the Web
Sites struggle with the dual task of:
◦ providing quality data
◦ providing usable and attractive interfaces to access that data
(73) “Raw Data Now!” (Tim Berners-Lee, TED Talk, 2009, http://bit.ly/dg7H7Z)
Semantic Web technologies allow a separation of tasks:
1. publish quality, interlinked datasets
2. “mash up” datasets for a better user experience
(74) The “network effect” is also valid for data
There are unexpected usages of data that its authors may never have thought of
“Curating”, using, and exploiting the data requires a different expertise
(75) W3C:
◦ was one of the initiators of the Semantic Web (Tim Berners-Lee and others)
◦ is the place where Semantic Web standards are developed and defined
◦ is an integral part of the Semantic Web community
(76) Standardization is done by groups, with W3C members delegating experts
Each group has at least one W3C staff member to help the process and contribute to the technology
◦ there is a formal process that has to be followed
◦ that is the price to pay…
(78) The public can comment at specific points in the process
Groups must take all comments into account
◦ the number of comments can run into the hundreds…
(79) Regular telecons (usually once a week)
Possibly 1–2 face-to-face meetings a year
Lots of email discussion
Editorial work to get everything properly written down
Average life span of a group: 2–3 years
(81) Thank you for your attention!
These slides are also available on the Web: http://www.w3.org/2010/Talks/1015-SaoPaulo-Office-IH/