Copyright Antidot™ 1 Linked Enterprise Data LEVERAGING THE SEMANTIC WEB STACK IN A CORPORATE ENVIRONMENT ISWC 2012 – BOSTON FABRICE LACROIX – LACROIX@ANTIDOT.NET

Copyright Antidot™ 2 Antidot – who we are
French-based software vendor
• Since 1999 | Paris, Lyon, Aix-en-Provence
• Information access | Data management
Mission: Provide our customers with innovative, customizable solutions that help them create value with their data and make their employees more aware and efficient.

Copyright Antidot™ 3 Clients
Publishing, Healthcare, Enterprises, E-commerce

Copyright Antidot™ 4 Unstructured documents
• files, ECM, collaborative spaces
• intranet, extranet, Web sites
• e-mails, instant messaging

Copyright Antidot™ 5 Structured data
• CRM, ERP, directory
• knowledge bases
• business applications (production, support)

Copyright Antidot™ 6 Information systems are bloated
1 practice => 1 need => 1 application => 1 silo
• The information system is driven by process
• Data are numerous, varied and scattered

Copyright Antidot™ 7 Solutions or workarounds?
BI, MDM, SOA, Search

Copyright Antidot™ 8 Solutions and workarounds
Enterprise Search brings little value to users
• Document oriented
• Does not solve real business problems
Google-like | Verity-like

Copyright Antidot™ 9 What we want

Copyright Antidot™ 10 What we want
LDAP, CRM, Production, ERP, ECM, Files, Support

Copyright Antidot™ 11 Changing the paradigm
Switching from an application view to a data-centric way of thinking.

Copyright Antidot™ 12 Bring out the implicit
Build the Giant Enterprise Graph

Copyright Antidot™ 13 LED: Linked Enterprise Data
The application of Semantic Web technologies and Linked Data principles to the enterprise infrastructure

Copyright Antidot™ 14 What works for the Web… Federating silos on the Web http://www.w3.org/People/Ivan/CorePresentations/RDFTutorial/Slides.html#(102)

Copyright Antidot™ 15 …can’t always be used in corporate IS
• Legacy apps can’t be "SPARQL’ed"
• 80% of the data is unstructured or semi-structured and doesn’t fit the model as such
• Defining vocabularies/ontologies for silos is too complex and expensive
• The business doesn’t want RDF per se, but valuable information
• External data is available in XML/JSON through Web Services
• Staff are trained for RDB, XML and Web apps
• No-risk and stability strategy: SemWeb technology is considered new and immature

Copyright Antidot™ 16 The RDF/storage approach
Setting up a global RDF repository does not work either
• IT staff are afraid of the "RDF everywhere" activists

Copyright Antidot™ 17 Semantic Web technology is still the right solution in a corporate environment
BUT it is not an end in itself
JUST use it as a means

Copyright Antidot™ 18 Just do it
Think of it as a stream paradigm
• build new objects using existing data
• without interfering with the existing infrastructure
• with SemWeb somewhere under the hood

Copyright Antidot™ 19 Enterprise Graph HowTo
Construct the graph
• generate triples from data
• create triples from documents
Leverage the graph
• enrich
• infer
Browse the graph
• select resources
• build objects
Trash the graph

Copyright Antidot™ 20 How: extract & normalize
Harvest and normalize
• as in an ETL
• fetch, clean, transform…
• normalize records (names, IDs) to prepare the linking step
For databases
• db2triples: an RDB2RDF implementation by Antidot (open source, W3C validated)
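
The normalization of names and IDs mentioned on this slide could look roughly like the sketch below. It is only an illustration under assumptions: the function names, the record layout and the sample values are hypothetical and do not come from the talk or from any Antidot product. The idea is that two silos spelling the same person or identifier differently end up with the same key, which is what the later linking step needs.

```python
import re
import unicodedata


def normalize_name(raw: str) -> str:
    """Return an accent-free, lowercase, order-insensitive key for a person name."""
    decomposed = unicodedata.normalize("NFKD", raw)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).split()
    # Sorting the tokens makes "Martin, Élodie" and "élodie MARTIN" comparable.
    return " ".join(sorted(tokens))


def normalize_id(raw: str) -> str:
    """Keep only letters and digits, uppercased, so 'emp-0042' matches 'EMP 0042'."""
    return re.sub(r"[^A-Za-z0-9]", "", raw).upper()


# Hypothetical records for the same employee, harvested from two silos.
crm_record = {"name": "  Martin,  Élodie ", "id": "emp-0042"}
ldap_record = {"name": "élodie MARTIN", "id": "EMP 0042"}

same_name = normalize_name(crm_record["name"]) == normalize_name(ldap_record["name"])
same_id = normalize_id(crm_record["id"]) == normalize_id(ldap_record["id"])
```

Sorting name tokens is a deliberately crude choice; it ignores homonyms and would need refinement on real directory data.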

Copyright Antidot™ 21 How: semantize
Don’t transform everything into RDF
• cherry-pick a subset of interesting fields for each object and create their RDF triple counterparts
• interesting == needed for linking or inferring

Copyright Antidot™ 22 How: semantize
Triples generation
• Be smart: avoid upfront ontology design, use small vocabularies
• Be pragmatic: transform XML tags and field names into predicates
• Be agile: only insert what you need. And when you need more, add more.
Semantic Web fuels the modeling, linking and information building process
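
As a rough illustration of "cherry-pick fields" and "field names become predicates", the sketch below turns one record into a handful of triples. Everything named here is an assumption made for the example: the `http://example.org/` base URI, the `record_to_triples` helper and the sample customer record are hypothetical, and triples are modeled as plain Python tuples rather than with an RDF library.

```python
BASE = "http://example.org/"  # hypothetical namespace, not from the talk


def record_to_triples(kind, record, id_field, keep):
    """Create triples for the fields listed in `keep` only; ignore everything else."""
    subject = f"{BASE}{kind}/{record[id_field]}"
    triples = [(subject, "rdf:type", f"{BASE}vocab/{kind}")]
    for field in keep:
        value = record.get(field)
        if value not in (None, ""):
            # The field name is reused as the predicate: no upfront ontology design.
            triples.append((subject, f"{BASE}vocab/{field}", value))
    return triples


# Hypothetical CRM record; only two fields are needed for linking.
customer = {
    "id": "C17",
    "name": "Acme Corp",
    "account_manager": "EMP0042",
    "fax": "+33 1 00 00 00 00",
    "notes": "long free text that stays in the CRM",
}

triples = record_to_triples("customer", customer, "id", keep=["name", "account_manager"])
```

When a new use case needs the `fax` field, adding it to `keep` is the whole change, which is the agility the slide describes.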

Copyright Antidot™ 24 How: semantize
Unstructured documents
• Extract metadata and transform it to RDF as needed.
  ➡ Ex: author => dc:creator
• Use text mining to extract named entities: people, organizations, products…
  ➡ generate the entity lists from the data sources: directory for employees, CRM for companies and people, ERP for products
  ➡ create triples like doc_URI quotes entity_URI
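
The sketch below shows one simple way the two ideas on this slide could fit together: document metadata mapped to `dc:creator`, and an entity list built from enterprise sources used to emit `doc_URI quotes entity_URI` triples. It is a minimal sketch under assumptions: the `ex:quotes` predicate name, the entity dictionary and the URIs are hypothetical, and plain whole-word matching stands in for real text mining, which would have to handle name variants and ambiguity.

```python
import re

# Hypothetical entity list, as it might be generated from the directory and the CRM.
entities = {
    "elodie martin": "http://example.org/employee/EMP0042",
    "acme corp": "http://example.org/customer/C17",
}


def document_triples(doc_uri, metadata, text, entity_index):
    """Return metadata triples plus one 'quotes' triple per entity found in the text."""
    triples = []
    if metadata.get("author"):
        triples.append((doc_uri, "dc:creator", metadata["author"]))
    lowered = text.lower()
    for label, entity_uri in entity_index.items():
        # Whole-word match on the entity label; a stand-in for a real extractor.
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            triples.append((doc_uri, "ex:quotes", entity_uri))
    return triples


doc_triples = document_triples(
    "http://example.org/doc/1",
    {"author": "J. Doe"},
    "Meeting notes: Elodie Martin visited Acme Corp last week.",
    entities,
)
```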

Copyright Antidot™ 25 How: semantize
Unstructured documents
• Compare documents using various dedicated algorithms
  ➡ is the same
  ➡ is included
  ➡ is similar
  ➡ is related
• Generate new triples
  ➡ create triples like is_sub_version_of

Copyright Antidot™ 27 How: enrich
Enrich the graph
• run specific algorithms to generate more links and triples (classifiers, topic detection, …)
• insert external data gathered from Linked Open Data (LOD) or other external datasets or APIs
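
The talk does not say which comparison algorithms are used. As one possible illustration, the sketch below compares documents through sets of word shingles and emits a triple for "is the same", "is included" or "is similar". The shingle size, the 0.5 threshold and the `ex:` predicate names are all assumptions chosen for the example.

```python
def shingles(text, size=3):
    """Return the set of overlapping word n-grams of a text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(max(len(words) - size + 1, 1))}


def compare(doc_a, text_a, doc_b, text_b, similar_at=0.5):
    """Return a relation triple between two documents, or None if they look unrelated."""
    a, b = shingles(text_a), shingles(text_b)
    if a == b:
        return (doc_a, "ex:is_same_as", doc_b)
    if a < b:  # every shingle of A also appears in B
        return (doc_a, "ex:is_included_in", doc_b)
    if b < a:
        return (doc_b, "ex:is_included_in", doc_a)
    jaccard = len(a & b) / len(a | b)
    if jaccard >= similar_at:
        return (doc_a, "ex:is_similar_to", doc_b)
    return None


v1 = "the quarterly report covers sales in europe"
v2 = v1 + " and asia for 2012"
relation = compare("ex:doc1", v1, "ex:doc2", v2)
```

Shingle containment is only a proxy for inclusion; deciding that one document is a sub-version of another would normally also use metadata such as dates or file names.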

Copyright Antidot™ 28 How: infer
Create new knowledge
• add rules according to your needs
IF a coworker is quoted in documents AND this coworker belongs to a business unit THEN the business unit is bound to the documents
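
Read as "a coworker is quoted in a document, and that coworker belongs to a business unit, so the business unit is bound to the document", the rule on this slide can be applied by a single forward pass over the triples. The sketch below is an assumption-laden illustration: the predicate names `ex:quotes`, `ex:belongs_to` and `ex:is_bound_to` are hypothetical, and a real deployment would more likely express the rule in the rule language of its triplestore.

```python
def infer_unit_bindings(triples):
    """Apply: doc quotes coworker AND coworker belongs_to unit => unit is_bound_to doc."""
    units_of = {}
    for subject, predicate, obj in triples:
        if predicate == "ex:belongs_to":
            units_of.setdefault(subject, set()).add(obj)

    inferred = set()
    for subject, predicate, obj in triples:
        if predicate == "ex:quotes":
            for unit in units_of.get(obj, ()):
                inferred.add((unit, "ex:is_bound_to", subject))
    # Only return triples that are actually new.
    return inferred - set(triples)


graph = {
    ("ex:doc1", "ex:quotes", "ex:EMP0042"),
    ("ex:EMP0042", "ex:belongs_to", "ex:unit_sales"),
    ("ex:doc2", "ex:quotes", "ex:C17"),  # a customer, not a coworker with a unit
}
new_triples = infer_unit_bindings(graph)
```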

Copyright Antidot™ 30 How: build
Build
• select resources corresponding to object seeds (using SPARQL queries)
• for each seed, follow links smartly in order to create basic objects
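
The build step can be pictured as: pick the seed resources, then walk a chosen set of links from each seed to assemble a nested object. The sketch below does this over an in-memory set of triples. It is only a stand-in under assumptions: the slide uses SPARQL against a triplestore for seed selection, whereas here a plain filter plays that role (roughly what `SELECT ?s WHERE { ?s rdf:type ex:Customer }` would return), and all names and data are hypothetical.

```python
def select_seeds(triples, rdf_type):
    """Return the resources of a given type, in a stable order."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == rdf_type)


def build_object(triples, seed, follow, max_depth=2):
    """Collect the seed's properties; expand only the predicates listed in `follow`."""
    obj = {"uri": seed}
    if max_depth == 0:
        return obj  # depth limit also protects against cycles in the graph
    for subject, predicate, value in sorted(triples):
        if subject != seed:
            continue
        if predicate in follow:
            nested = build_object(triples, value, follow, max_depth - 1)
            obj.setdefault(predicate, []).append(nested)
        else:
            obj.setdefault(predicate, []).append(value)
    return obj


graph = {
    ("ex:C17", "rdf:type", "ex:Customer"),
    ("ex:C17", "ex:name", "Acme Corp"),
    ("ex:C17", "ex:account_manager", "ex:EMP0042"),
    ("ex:EMP0042", "ex:name", "Elodie Martin"),
}

seeds = select_seeds(graph, "ex:Customer")
customer_objects = [build_object(graph, seed, follow={"ex:account_manager"}) for seed in seeds]
```

Which predicates to follow, and how deep, is where "smartly" comes in: following every link would pull in most of the graph.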

Copyright Antidot™ 31 How: build
Finalize
• decorate the new knowledge objects with data that was set apart (not loaded in the triplestore)
• now we have rich, user-actionable objects

Copyright Antidot™ 33 How: expose
Make the new information available to users and to the entire IS
[Diagram labels: Enrich, Harvest, Classify, Semantize, Normalize, Annotate, Indexation, AFS search engine, RDF Triplestore (Linked Data), Relational DB]
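
Decoration can be as simple as a lookup by URI into whatever store kept the fields that were never turned into triples. The sketch below assumes a hypothetical in-memory `side_store` dictionary and a hypothetical `decorate` helper; in practice the side data would come from the original source or an intermediate cache.

```python
def decorate(obj, side_store):
    """Return a copy of the object enriched with fields that were kept out of the graph."""
    decorated = dict(obj)
    for key, value in side_store.get(obj["uri"], {}).items():
        # Fields already built from the graph take precedence over side data.
        decorated.setdefault(key, value)
    return decorated


# Hypothetical object built from the graph, and data set apart during harvesting.
built = {"uri": "ex:C17", "ex:name": ["Acme Corp"]}
side_store = {"ex:C17": {"fax": "+33 1 00 00 00 00", "notes": "long free text"}}

final_object = decorate(built, side_store)
```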

Copyright Antidot™ 34 Conclusion
It works!
• The triples we create and the inference rules we add are dictated by the goal / application
  ➡ usage and value oriented
• We benefit from the lazy, flexible, dynamic modeling of RDF, RDFS and OWL
  ➡ we are agile
• What matters is the graph. But the graph is not the triplestore
  ➡ storage independent

Copyright Antidot™ 35 There’s an app for that
Antidot Information Factory
• a software solution designed specifically to leverage structured and unstructured data
• enables large-scale processing of existing data
• automates publishing of enriched or newly created information
Harvest, Normalize, Semantize, Enrich, Build, Expose

Copyright Antidot™ 36 The Giant Enterprise Graph Now we have a path to let SemWeb enter the enterprise

Copyright Antidot™ 37 THANKS FOR YOUR ATTENTION QUESTIONS? Discuss Understand Learn Exchange www.antidot.net info@antidot.net
