Copyright Antidot™ 1 Linked Enterprise Data LEVERAGING THE SEMANTIC WEB STACK IN A CORPORATE ENVIRONMENT ISWC 2012 – BOSTON FABRICE LACROIX –


1 Linked Enterprise Data
LEVERAGING THE SEMANTIC WEB STACK IN A CORPORATE ENVIRONMENT
ISWC 2012 – BOSTON
FABRICE LACROIX – LACROIX@ANTIDOT.NET

2 Antidot – who we are
French-based software vendor
• Since 1999 | Paris, Lyon, Aix-en-Provence
• Information access | Data management
Mission: Provide our customers with innovative, customizable solutions that help them create value with their data and make their employees more aware and efficient.

3 Clients
Publishing, Healthcare, Enterprises, E-commerce

4 Unstructured documents
files, ECM, collaborative spaces
intranet, extranet, Web sites
e-mails, instant messaging

5 Structured data
CRM, ERP, directory
knowledge bases
business applications (production, support)

6 Information systems are bloated
1 practice => 1 need => 1 application => 1 silo
The information system is driven by process
Data are numerous, varied and scattered

7 Solutions or workarounds?
BI, MDM, SOA, Search

8 Solutions and workarounds
Enterprise Search brings little value to users
• Document oriented
• Does not solve real business problems
Google-like, Verity-like

9 What we want

10 What we want
LDAP, CRM, Production, ERP, ECM, Files, Support

11 Changing the paradigm
Switching from an application view to a data-centric way of thinking.

12 Bring out the implicit
Build the Giant Enterprise Graph

13 LED: Linked Enterprise Data
Application of Semantic Web technologies and Linked Data principles to the enterprise infrastructure

14 What works for the Web…
Federating silos on the Web
http://www.w3.org/People/Ivan/CorePresentations/RDFTutorial/Slides.html#(102)

15 …can’t always be used in corporate IS
• Legacy apps can’t be "SPARQL’ed"
• 80% of the data is un- or semi-structured and doesn’t fit the model as such
• Defining vocabularies/ontologies for silos is too complex and expensive
• We don’t want RDF per se, but valuable information
• External data is available in XML/JSON through Web Services
• Staff are trained for RDB, XML, Web apps
• No-risk and stability strategy: SemWeb technology is considered new and immature

16 The RDF/storage approach
Setting up a global RDF repository does not work either
• IT staff are afraid of the "RDF everywhere" activists

17 Semantic Web technology is still the right solution in a corporate environment
BUT it is not an aim
JUST use it as a means

18 Just do it
Think of it as a stream paradigm
• build new objects using existing data
• without interfering with the existing infrastructure
• with SemWeb somewhere under the hood

19 Enterprise Graph HowTo
Construct the graph
• generate triples from data
• create triples from documents
Leverage the graph
• enrich
• infer
Browse the graph
• select resources
• build objects
Trash the graph

20 How: extract & normalize
Harvest and normalize
• as in an ETL
• fetch, clean, transform…
• normalize records (names, IDs) to prepare the linking step
For databases
• db2triples: an RDB2RDF implementation by Antidot (open source, W3C validated)
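
The "normalize records (names, IDs)" step can be sketched as follows. This is only a minimal illustration, not Antidot's implementation: the two records, their field names, and the helpers `normalize_name` and `normalize_id` are hypothetical, and the ID convention (letters and digits only, uppercased) is an assumption made for the example.

```python
import re
import unicodedata


def normalize_name(raw: str) -> str:
    """Strip accents, collapse whitespace and lowercase so names compare equal."""
    decomposed = unicodedata.normalize("NFKD", raw)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", no_accents).strip().lower()


def normalize_id(raw: str) -> str:
    """Keep only letters and digits, uppercased (hypothetical ID convention)."""
    return re.sub(r"[^0-9A-Za-z]", "", raw).upper()


# Two hypothetical records describing the same person in two silos.
crm_record = {"contact": "  Hélène   DUPONT ", "account_id": "ac-00 42"}
directory_record = {"name": "Helene Dupont", "account": "AC0042"}

same_person = normalize_name(crm_record["contact"]) == normalize_name(directory_record["name"])
same_account = normalize_id(crm_record["account_id"]) == normalize_id(directory_record["account"])
```

Once both silos agree on one spelling for names and IDs, the later linking step can match records by simple equality.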

21 How: semantize
Don’t transform everything into RDF
• cherry-pick a subset of interesting fields for each object and create their RDF triple counterparts
• interesting == needed for linking or inferring

22 How: semantize
Triples generation
• Be smart: avoid upfront ontology design, use small vocabularies
• Be pragmatic: transform XML tags and field names to predicates
• Be agile: only insert what you need. And when you need more, add more.
Semantic Web fuels the modeling, linking and information building process
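
A minimal sketch of the cherry-picking idea: only a chosen subset of fields becomes triples, and each field name is reused as a predicate. Triples are plain Python tuples here rather than a real RDF library, and the `http://example.org/` namespace, the `record_to_triples` helper and the customer record are all assumptions made for illustration.

```python
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not from the slides


def record_to_triples(kind, record, id_field, keep):
    """Turn the fields listed in `keep` into (subject, predicate, object) tuples.

    Field names are reused as predicates; every other field is ignored.
    """
    subject = f"{BASE}{kind}/{quote(str(record[id_field]))}"
    triples = [(subject, f"{BASE}vocab/type", f"{BASE}vocab/{kind}")]
    for field in keep:
        value = record.get(field)
        if value is not None:
            triples.append((subject, f"{BASE}vocab/{field}", value))
    return triples


# Hypothetical CRM record: only "name" and "country" are needed for linking.
customer = {"id": "C42", "name": "ACME", "country": "FR", "fax": "n/a", "notes": "long text"}
triples = record_to_triples("customer", customer, "id", keep=["name", "country"])
```

When a new use case needs another field, adding it to `keep` is the only change, which matches the "when you need more, add more" advice.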

23 Enterprise Graph HowTo
Construct the graph
• generate triples from data
• create triples from documents
Leverage the graph
• enrich
• infer
Browse the graph
• select resources
• build objects
Trash the graph

24 How: semantize
Unstructured documents
• Extract metadata and transform it to RDF as needed.
  ➡ Ex: author => dc:creator
• Use text mining to extract named entities: people, organizations, products…
  ➡ generate the entity lists from the data sources: directory for employees, CRM for companies and people, ERP for products
  ➡ create triples like doc_URI quotes entity_URI
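
The two steps above can be sketched as below. This is a simplified stand-in for real text mining: it only does case-insensitive whole-phrase matching against entity lists. The `ex:` URIs, the sample document, and the helpers `metadata_triples` and `entity_triples` are hypothetical.

```python
import re


def metadata_triples(doc_uri, metadata, mapping):
    """Map selected metadata fields to predicates, e.g. author => dc:creator."""
    return [(doc_uri, mapping[key], value) for key, value in metadata.items() if key in mapping]


def entity_triples(doc_uri, text, entity_list):
    """Emit (doc_URI, quotes, entity_URI) for each known entity found in the text."""
    triples = []
    for label, entity_uri in entity_list.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            triples.append((doc_uri, "quotes", entity_uri))
    return triples


# Hypothetical entity list, as it might be generated from directory, CRM and ERP.
entity_list = {
    "Jane Smith": "ex:employee/jsmith",
    "ACME": "ex:company/acme",
    "Widget 3000": "ex:product/w3000",
}

doc_uri = "ex:doc/17"
meta = metadata_triples(doc_uri, {"author": "Jane Smith", "pages": 3}, {"author": "dc:creator"})
quoted = entity_triples(doc_uri, "Jane Smith met ACME last week about pricing.", entity_list)
```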

25 How: semantize
Unstructured documents
• Compare documents using various dedicated algorithms
  ➡ is the same
  ➡ is included
  ➡ is similar
  ➡ is related
• Generate new triples
  ➡ create triples like is_sub_version_of
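
The slides do not say which comparison algorithms are used. As one possible illustration, the sketch below compares documents by their sets of three-word shingles and uses Jaccard overlap to pick a relation. The predicate names, the shingle size and the 0.5 threshold are assumptions, not values from the talk.

```python
def shingles(text, size=3):
    """Return the set of consecutive word n-grams of a text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def relation(doc_a, doc_b, similar_threshold=0.5):
    """Pick a relation from doc_a to doc_b, or None if they share nothing."""
    sa, sb = shingles(doc_a), shingles(doc_b)
    if not sa or not sb:
        return None
    if sa == sb:
        return "is_the_same_as"
    if sa < sb:  # every shingle of doc_a also appears in doc_b
        return "is_included_in"
    jaccard = len(sa & sb) / len(sa | sb)
    if jaccard >= similar_threshold:
        return "is_similar_to"
    if jaccard > 0:
        return "is_related_to"
    return None


short = "the quarterly report covers sales in europe"
longer = "the quarterly report covers sales in europe and asia"
triple = ("ex:doc/1", relation(short, longer), "ex:doc/2")
```

Each non-empty result becomes one more triple in the graph, in the same way as the is_sub_version_of example on the slide.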

26 Enterprise Graph HowTo
Construct the graph
• generate triples from data
• create triples from documents
Leverage the graph
• enrich
• infer
Browse the graph
• select resources
• build objects
Trash the graph

27 How: enrich
Enrich the graph
• run specific algorithms to generate more links and triples (classifiers, topic detection, …)
• insert external data gathered from the LOD or other external datasets or APIs

28 How: infer
Create new knowledge
• add rules according to your needs
IF a coworker is quoted in documents
AND this coworker belongs to a business unit
THEN the business unit is bound to the documents
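
The rule on this slide can be sketched as a single forward-chaining step over plain tuples. The predicate names (`quotes`, `belongs_to`, `is_bound_to`), the `ex:` URIs and the `infer_business_unit_links` helper are hypothetical; a real deployment would more likely express this in a rule language or as a SPARQL CONSTRUCT query.

```python
def infer_business_unit_links(triples):
    """IF doc quotes coworker AND coworker belongs_to unit THEN unit is_bound_to doc."""
    units_of = {}
    for subject, predicate, obj in triples:
        if predicate == "belongs_to":
            units_of.setdefault(subject, set()).add(obj)

    inferred = set()
    for subject, predicate, obj in triples:
        if predicate == "quotes":
            for unit in units_of.get(obj, ()):
                inferred.add((unit, "is_bound_to", subject))
    # Return only triples that are genuinely new, in a stable order.
    return sorted(inferred - set(triples))


graph = [
    ("ex:doc/17", "quotes", "ex:employee/jsmith"),
    ("ex:doc/18", "quotes", "ex:employee/jsmith"),
    ("ex:employee/jsmith", "belongs_to", "ex:unit/sales"),
    ("ex:doc/19", "quotes", "ex:company/acme"),  # not a coworker: no inference
]
new_triples = infer_business_unit_links(graph)
```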

29 Enterprise Graph HowTo
Construct the graph
• generate triples from data
• create triples from documents
Leverage the graph
• enrich
• infer
Browse the graph
• select resources
• build objects
Trash the graph

30 How: build
Build
• select resources corresponding to object seeds (using SPARQL queries)
• for each seed, follow links smartly in order to create basic objects
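
A minimal sketch of the build step, under simplifying assumptions: the graph is an in-memory list of tuples, and seed selection is a plain filter on an `rdf:type` triple where the slides use SPARQL queries. The `select_seeds` and `build_object` helpers, the predicate names and the sample data are hypothetical. "Follow links smartly" is reduced here to a whitelist of predicates and a depth limit.

```python
def select_seeds(graph, type_uri):
    """Stand-in for a SPARQL SELECT: all subjects having the given rdf:type."""
    return sorted(s for s, p, o in graph if p == "rdf:type" and o == type_uri)


def build_object(graph, seed, follow, depth=2):
    """Collect a seed's properties, expanding only the predicates listed in `follow`."""
    obj = {"uri": seed}
    if depth == 0:
        return obj  # depth limit also protects against cycles in the graph
    for subject, predicate, value in graph:
        if subject != seed:
            continue
        if predicate in follow:
            child = build_object(graph, value, follow, depth - 1)
            obj.setdefault(predicate, []).append(child)
        else:
            obj.setdefault(predicate, []).append(value)
    return obj


graph = [
    ("ex:cust/1", "rdf:type", "ex:Customer"),
    ("ex:cust/1", "name", "ACME"),
    ("ex:cust/1", "has_ticket", "ex:ticket/7"),
    ("ex:ticket/7", "rdf:type", "ex:Ticket"),
    ("ex:ticket/7", "title", "Login fails"),
]
seeds = select_seeds(graph, "ex:Customer")
objects = [build_object(graph, seed, follow={"has_ticket"}) for seed in seeds]
```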

31 How: build
Finalize
• decorate the new knowledge objects with data set apart (not loaded in the triplestore)
• now we have rich user-actionable objects
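
The decoration step amounts to joining each built object with fields that were deliberately kept out of the graph. A small sketch, assuming a hypothetical key-value side store indexed by resource URI (the `decorate` helper and all field names are invented for the example):

```python
def decorate(obj, side_store, fields):
    """Return a copy of obj enriched with the requested fields from the side store."""
    extra = side_store.get(obj["uri"], {})
    decorated = dict(obj)
    for field in fields:
        if field in extra:
            decorated[field] = extra[field]
    return decorated


# Data set apart during harvesting: useful for display, useless for linking.
side_store = {
    "ex:cust/1": {"description": "Long marketing text", "internal_score": 7},
}
basic_object = {"uri": "ex:cust/1", "name": ["ACME"]}
rich_object = decorate(basic_object, side_store, fields=["description"])
```

Keeping bulky or display-only fields outside the triplestore keeps the graph small, in line with the earlier advice to insert only what is needed for linking or inferring.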

32 Enterprise Graph HowTo
Construct the graph
• generate triples from data
• create triples from documents
Leverage the graph
• enrich
• infer
Browse the graph
• select resources
• build objects
Trash the graph

33 How: expose
Make the new information available to users and to the entire IS
[Diagram labels: Enrich, Harvest, Classify, Semantize, Normalize, Annotate, Indexation, AFS search engine, RDF Triplestore (Linked Data), Relational DB]

34 Conclusion
It works!
• The triples we create and the inference rules we add are dictated by the goal / application
  ➡ usage and value oriented
• We benefit from the lazy-flexible-dynamic modeling of RDF-RDFS-OWL
  ➡ we are agile
• What matters is the graph. But the graph is not the triplestore
  ➡ storage independent

35 There’s an app for that
Antidot Information Factory
• a software solution designed specifically to leverage structured and unstructured data
• enables large-scale processing of existing data
• automates publishing of enriched or newly created information
Harvest, Normalize, Semantize, Enrich, Build, Expose

36 The Giant Enterprise Graph
Now we have a path to let SemWeb enter the enterprise

37 THANKS FOR YOUR ATTENTION
QUESTIONS?
Discuss, Understand, Learn, Exchange
www.antidot.net
info@antidot.net

