Published by Myles O’Brien. Modified over 9 years ago.
1
ICLP-09
Enabling serendipitous search on the Web of Data using Prolog
Jan Wielemaker, VU University Amsterdam
2
Issues addressed
Recent developments reshaped the Web:
  The Web moved from a “Web of documents” to a “Web of data” and a “Web of applications”
  “Open” and “Linked” data make massive amounts of data available for processing by machines
How can we deploy Prolog in this environment?
3
Overview
Introducing the semantic search engine “ClioPatria” and the problem it addresses
Why (not) use Prolog for semantic web applications?
Processing RDF data
Applying Prolog in web-servers
Creating interactive web-applications
Wrap-up
4
PART I
The ClioPatria use-case: integrate the digital collections of multiple museums and connect them to background knowledge
5
Collection and Meta-data Schema Vocabularies
6
Background knowledge
7
The Web: documents and links
[Figure: documents identified by URLs, connected by Web links (untyped hyperlinks)]
8
The Semantic, or Data Web: data and links
[Figure: resources identified by URLs and connected by typed Web links. The painting “Green Stripe (Mme Matisse)” (Royal Museum of Fine Arts, Copenhagen) has a “creator” link (Dublin Core) to the painter “Henri Matisse” (Getty ULAN).]
11
… nice graph, but... What about semantics? What about structure?
12
Semantic Web data model: RDF
1 fact = R(O1, O2) = 1 “triple”
many facts = labelled graph = RDF
URIs as identifiers, typed relations between typed objects
Has many different syntaxes (XML (W3C), N3, Turtle, graphical, etc.). Doesn’t matter: it’s a data model
Slide by Frank van Harmelen
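The triple model can be made concrete in a few lines of code. What follows is a minimal sketch in JavaScript (the language of the client-side code later in this deck) and is not part of the original slides. The names `triples` and `match` and the shortened resource identifiers are made up for illustration.

```javascript
// Hypothetical in-memory triple list; identifiers are shortened for readability.
const triples = [
  ["ulan:matisse", "rdf:type", "ulan:Person"],
  ["ulan:matisse", "rdfs:label", "Matisse, Henri"],
  ["work:green_stripe", "dc:creator", "ulan:matisse"],
];

// Return every triple that agrees with the pattern; null matches anything.
function match(s, p, o) {
  return triples.filter(
    ([ts, tp, to]) =>
      (s === null || s === ts) &&
      (p === null || p === tp) &&
      (o === null || o === to)
  );
}

// Follow two edges of the labelled graph:
// label -> resource, then resource -> works it created.
const worksByMatisse = match(null, "rdfs:label", "Matisse, Henri")
  .flatMap(([person]) => match(null, "dc:creator", person))
  .map(([work]) => work);
```

Because every fact has the same three-place shape, one generic pattern matcher is enough to walk the whole graph, whatever syntax the data was loaded from.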
13
Semantic Web data model: RDF Schema
hierarchy of types, hierarchy of relations, domain/range-constraints
simple: no negation, disjunction, universal
Slide by Frank van Harmelen
14
Semantic Web data model: OWL and SWRL
everything you wanted to say but cannot say in RDF(S)
negation, disjunction, cardinality, limited universal, relational algebra (trans, symm)
still no composition of relations (DL-based)
SWRL: rules with DL concepts as atoms
Full, DL, Lite
Slide by Frank van Harmelen
15
Structure for thesauri
16
Structure for works of Art
17
From meta-data to semantic meta-data
Thesaurus: schema mapping (SKOS), thesaurus alignment
Meta-data: schema mapping (VRA), meta-data mapping
5 collections → 11,000,000 triples
18
Part of a large cloud of linked data!
19
The challenge
How to make use of this network for search?
Can we search better?
Can we present better?
20
ClioPatria
A Prolog web-server with RDF-store
Developed to explore this challenge
Explores the graph using best-first search based on semantic distance
Clusters results based on their relation to the query
21
ClioPatria: “Matisse”
“Matisse” in the title
Located in “Musee Matisse”
Created by “Matisse”
Paintings in the same style as used by “Matisse”
22
Serendipitous?
Serendipity is the effect by which one accidentally discovers something fortunate, especially while looking for something else entirely unrelated (Wikipedia).
The search is not based on any schema
It can find results through unexpected paths
It often finds many unintended results (i.e., it answers multiple “graph” queries)
This remains manageable due to clustering → “Post-query disambiguation”
23
Serendipitous … “Picasso”
Things made from “Picasso marble”
24
ClioPatria fact-sheet
Prolog: 246 files, 67,500 lines
Developers: 3 core, about 10 occasional
Triples loaded: used with up to 22,000,000; scales to 300,000,000 in 64 GB of memory
Usage: known to be in use in 6 projects
http://e-culture.multimedian.nl/software/ClioPatria.shtml
25
Part II
Using Prolog for the Semantic Web
26
The neaties vs. the scruffies
(DL-)Logic background
  In search of expressive logics and correct, efficient resolution techniques
  LP: F-Logic, ASP, ALP, FO(.), … (Marc Denecker)
Webby background
  In search of something useful to do with huge amounts of shallow and inconsistent facts
  Simple logics; techniques need be neither sound nor complete
27
Why NOT Prolog?
The core concepts in the Web community are:
  Networking
  Concurrency
  Web-page generation
  Internationalization
  ...
These are typically not associated with Prolog
28
Why Prolog?
RDF fits nicely with the relational model of Prolog
With a little work it does everything SPARQL can
… but it is much more flexible
Most languages in the SW-community can be translated into Horn-clauses:
  OWL (large subset)
  Rule languages: SWRL, RIF, ...
29
The Semantic Web seen from Prolog
Pure predicate rdf/3:
  rdf(?Subject, ?Predicate, ?Object) is nondet.
URI → Atom
Literal → literal(Atom)
          literal(lang(Code, Atom))
          literal(type(URI, Atom))
30
URI: XML Namespaces
Namespaces are expanded at compile-time by means of rules for goal_expansion/2, so

  rdf(S, 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
         'http://e-culture.multimedian.nl/ns/getty/ulan#Person').

can be written as

  rdf(S, rdf:type, ulan:'Person').

Toplevel and debugger results are made readable again using portray/1
31
A simple example

  ?- module(rdfs_entailment).

  rdfs_entailment: ?-
      rdf(X, rdf:type, ulan:'Person'),
      rdf(X, rdfs:label, literal('Matisse, Henri')),
      rdf(Work, dc:creator, X).
32
Optimising

  ?- In = ( rdf(X, rdf:type, ulan:'Person'),
            rdf(X, rdfs:label, literal('Matisse, Henri')),
            rdf(Work, dc:creator, X)
          ),
     rdf_optimise(In, Goal).

  Goal = ( rdf(X, rdfs:label, literal('Matisse, Henri')),
           rdfql_carthesian([ bag([], rdf(X, rdf:type, ulan:'Person')),
                              bag([Work], rdf(Work, dc:creator, X))
                            ])
         ).
33
Advantages of Prolog over SPARQL
Flexibility and reuse:
  We can mix with arbitrary Prolog code
  We can name and combine queries
  We can do recursion
This is similar to SQL vs. Prolog, but processing RDF involves pattern-matching, rules and recursion, while datatypes are less important.
34
Prolog ↔ SPARQL
One SPARQL query to get result
Multiple SPARQL queries
Fetch triple-by-triple and process in client
35
Reasoning
Reasoning is connected to a language (RDFS, OWL, SWRL, …)
Reasoning derives facts from the triple store that are not explicitly provided in the dataset.
36
Options for Reasoning (I)
Reasoning adds (virtual) triples (entailment): the only API is rdf(S,P,O)
Forward reasoning
  Easy to implement
  Difficult to handle database updates
  Can explode using richer languages (e.g., OWL)
Backward reasoning
  Non-termination under SLD resolution
  Need for optimization of conjunctions
  Easy to provide alternative reasoners
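Forward reasoning as listed above can be sketched as a fixpoint loop that keeps adding entailed triples to the store. The JavaScript below is an illustration only, not ClioPatria's reasoner: it covers just two RDFS rules (transitivity of rdfs:subClassOf and inheritance of rdf:type), and the `ex:` names and the function `forwardClosure` are hypothetical.

```javascript
// Forward reasoning sketch: materialise entailed triples until nothing new appears.
function forwardClosure(input) {
  const key = (t) => JSON.stringify(t);
  const known = new Map(input.map((t) => [key(t), t]));
  let changed = true;
  while (changed) {
    changed = false;
    const snapshot = [...known.values()];
    for (const [s, p, o] of snapshot) {
      if (p !== "rdfs:subClassOf" && p !== "rdf:type") continue;
      for (const [s2, p2, o2] of snapshot) {
        if (p2 !== "rdfs:subClassOf" || s2 !== o) continue;
        // (s subClassOf o), (o subClassOf o2) => (s subClassOf o2)
        // (s type o),       (o subClassOf o2) => (s type o2)
        const derived = [s, p, o2];
        if (!known.has(key(derived))) {
          known.set(key(derived), derived);
          changed = true;
        }
      }
    }
  }
  return [...known.values()];
}

const closed = forwardClosure([
  ["ex:Painter", "rdfs:subClassOf", "ex:Artist"],
  ["ex:Artist", "rdfs:subClassOf", "ex:Person"],
  ["ex:matisse", "rdf:type", "ex:Painter"],
]);
```

In this toy input three stated triples grow to six, which hints at why materialisation can explode with richer languages, and why updates are awkward: retracting one stated triple may invalidate several derived ones.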
37
Alternative entailment reasoners as Prolog modules
Core RDF-DB: rdf/3
RDFS: rdf/3
OWL-Horst: rdf/3
....
38
Options for Reasoning (II)
Based on Abstract Syntax: dedicated high-level API
Forward reasoning
  Transformation (Thea OWL(-2) library)
Backward reasoning
Thea: http://www.semanticweb.gr/TheaOWLLib/
By Vangelis Vassiliadis and Chris Mungall
39
Reasoning with Abstract Syntax API
Core RDF-DB: rdf/3
....
Forward (transformation): Thea (OWL-2): subClassOf/2
Backward (Prolog rules): RDFS: rdfs_individual_of/2, rdfs_subclass_of/2, ...
40
Options for reasoning (summary)
Entailment-based
  Uniform query API → app can switch entailment
  Query API is low-level
  (Using forward reasoning) entailed graph is added to database → difficult to deal with multiple languages
Abstract-syntax based
  Each language has its own query API
  Query API is high-level
  Easy to deal with multiple languages
41
A closer look at the RDF store: requirements
Efficient in any instantiation-pattern (full indexing)
Deal with property-hierarchy
Deal with owl:sameAs
Literal indexing (prefix, full-text, ...)
Scalable to 10-100 M triples
42
Options for rdf/3 (I: Using Prolog)
Prolog dynamic database
  We need multiple indexes (e.g., YAP)
  Cannot exploit domain-specific aspects:
    Property-hierarchy matching
    Facts are ground, unordered and support limited types
  Hard to provide statistics for the optimizer because they are also domain-specific
43
Options for rdf/3 (II: Using an external store)
External store
  Slow connection (need to intern/extern URI-as-atom)
  We do not want (most of) the reasoning
44
Options for rdf/3 (III: Dedicated C)
Using dedicated C-library
  Can optimize for space based on limited datatypes
  Use atom-handles in the database (no intern/extern)
  Sort literals in an AVL-tree (prefix search)
  Keep counts (for query optimizations)
  Fast binary load/save format
45
RDF Processing (summary)
Expressing graph-patterns mixed with auxiliary Prolog is easy
  This is enough for a large part of RDF processing in semantic web applications
Reasoning
  Forward closure (easy, big, no changes)
  Backward: termination issues (tabling can help)
  Extending rdf/3 ↔ Using abstract language
46
Part III
Web-Applications
47
Web-Application Reference Architecture (Three Tier Model)
[Figure: three tiers: Database, Application Logic, Presentation generation. Clients shown: Web 2.0 (JavaScript, Web Browser) and Web 3.0 (Semantic Web: RDF, Linked Data).]
48
Protocols and Standards
[Figure: RDF Database and Application Logic, with links labelled “HTTP SPARQL”, “Prolog”, “HTTP” and “?”]
49
Prolog-to-HTTP
[Figure: HTTP → Web-Server (Tomcat, .NET, ...) → Interface (JPL, InterProlog, PrologBeans, ...) → Application (Prolog)]
Need to program in Tomcat/.NET/... & Prolog
Difficult deployment
JPL: One process (JNI/C interface)
  Fast, but hard to debug
InterProlog/PrologBeans/... (proprietary network)
50
Prolog-to-HTTP
[Figure: HTTP → Web-Server and Application, both in Prolog; the interface is the Prolog HTTP library]
Easy debugging
Easily extend the HTTP interface
Not “industry standard”
But … many languages provide an HTTP server library
51
Apache Deployment
Using Apache reverse-proxy and load-balancer

  ServerName www.swi-prolog.org
  ProxyPass / http://localhost:3040/

[Figure labels: Port 80, Port 3040, Prolog, VNC]
52
VNC server console
53
/api/search?q=picasso&count=100

  :- use_module(library(http/http_dispatch)).
  :- use_module(library(http/http_parameters)).
  :- use_module(library(http/http_json)).

  :- http_handler('/api/search', search, []).

  search(Request) :-
      http_parameters(Request,
                      [ q(Q, []),
                        start(S, [default(0)]),
                        count(C, [default(25)])
                      ]),
      search(Q, S, C, Results),
      reply_json(Results).
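The handler above declares three request parameters, two of which have defaults. As a rough illustration of what such a declaration does for the example URL, here is a minimal JavaScript sketch using the standard `URL` class. It is not the SWI-Prolog implementation; the function name `searchParameters` and the decision to convert `start` and `count` to numbers are assumptions made for this example.

```javascript
// Parse the query string of a request path and apply the declared defaults.
function searchParameters(path) {
  // The base URL is only needed so that a relative path can be parsed.
  const url = new URL(path, "http://localhost");
  const get = (name, fallback) => {
    const raw = url.searchParams.get(name);
    return raw === null ? fallback : raw;
  };
  const q = get("q", undefined);
  if (q === undefined) {
    throw new Error("missing required parameter: q"); // q has no default
  }
  return {
    q,
    start: Number(get("start", 0)),  // default(0)
    count: Number(get("count", 25)), // default(25)
  };
}

const params = searchParameters("/api/search?q=picasso&count=100");
```

Keeping the parameter declaration in one place means the handler body only ever sees complete, defaulted values, which is the same division of labour as in the Prolog clause.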
54
Summary HTTP support
Writing the HTTP-server in Prolog gives us:
  Good single-language development environment
  Incremental compilation: live-updating the server
  Deployment can be direct or through a proxy
Not so big: 12,000 lines for
  Core HTTP client and server
  HTML and JSON read/write
  Parameters, sessions, authorization, logging
55
Part IV
Creating Interactive Web Applications using Prolog
56
Web of Documents (Original drawing by Tim Berners-Lee)
57
Interactive Web-Applications
Server needs to keep track of client (sessions)
Client needs light-weight updates of the interface
… but HTTP is stateless …
58
Introducing State
Negotiate a session-key between client and server
Server associates state with this key
Client modifies the interface using JavaScript → AJAX
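The first two points can be sketched as a table that maps session keys to state. The JavaScript below is a minimal illustration under stated assumptions, not the session library used in the talk: `openSession`, `handleRequest` and the `requests` counter are hypothetical, and the key generation is deliberately simplistic.

```javascript
// Minimal session store: the server hands out a key and keeps state under it.
const sessions = new Map();
let nextId = 0;

function openSession() {
  // Counter plus random suffix. Math.random() is NOT a secure key source;
  // it only keeps this sketch short.
  const key = (nextId++).toString(36) + "-" + Math.random().toString(36).slice(2);
  sessions.set(key, { requests: 0 });
  return key; // in practice sent to the client, for example in a cookie
}

function handleRequest(key) {
  const state = sessions.get(key);
  if (!state) throw new Error("unknown session");
  state.requests += 1; // state survives between otherwise stateless requests
  return state.requests;
}

const keyA = openSession();
const keyB = openSession();
handleRequest(keyA);
const secondForA = handleRequest(keyA);
const firstForB = handleRequest(keyB);
```

Each request carries only the key; everything else stays on the server, so two clients never see each other's state.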
59
What is AJAX not?
60
Case
Create a web-interface for the N-queens problem
Interaction
  Select size of board
  Select implementation (Prolog ↔ clp(FD))
  Get first solution
  Get next solution or stop
State in backtrackable Prolog program
By Torbjörn Lager, Markus Triska, Jan Wielemaker
61
Step I: create initial page
[Figure: the Web Application Server (HTTP) sends the initial HTML page (HTML + JS) to the browser, which builds the initial DOM]
63
Step II: Add local interaction
[Figure: as in Step I, plus local interaction between JavaScript and the DOM inside the browser]
65
Options...

  <input type="button" id='opts' name="options" value="Options …" onClick="showOptions(true)">

  function showOptions(show) {
    document.getElementById("options").style.display = show ? "block" : "none";
  }
66
OK: applyOptions()
Set client state in global variables
Update the interface by changing the DOM
→ NO server interaction

  function applyOptions() {
    var size = parseInt(document.getElementById("size").value);
    if ( document.getElementById("queens").checked == true ) {
      algorithm = "queens";
    } else {
      algorithm = "clpfd_queens";
    }
    if ( size < 2 || size > 40 ) {
      alert("Size must be in the range 2..40");
    } else {
      boardsize = size;
      showOptions(false);
      document.getElementById("N").innerHTML = size;
      document.getElementById("who").innerHTML = (algorithm == "queens" ? "Prolog" : "clp(FD)");
      document.getElementById("board").innerHTML = board(boardsize, boardwidth);
    }
  }
68
Step III: Add server interaction
[Figure: as in Step II, plus server interaction between the JavaScript in the browser and the Web Application Server (HTTP)]
69
First...

  <input type="button" id='first' name="first" value="First" onClick="first()">

  function first() {
    working();
    YAHOO.util.Connect.asyncRequest(          // server request
      'GET',
      "/prolog/first?goal="+algorithm+"("+boardsize+",L)",
      { success: update });                   // what to do when the server responds
  }
70
Client code-fragment: handle response
Process as JSON
Update DOM based on JSON reply

  function update(o) {
    var solution = YAHOO.lang.JSON.parse(o.responseText);
    if (solution.solution) {
      if ( solution.next == true ) {
        setButtons(true);
      } else {
        setButtons(false);
      }
      clearBoard();
      setQueens(solution.solution.args[1].value);
      document.getElementById("msg").innerHTML = "CPU: " + solution.time.toPrecision(2) + " sec.";
    } else if ( solution.error ) {
      setButtons(false);
      document.getElementById("msg").innerHTML = " "+solution.error+" ";
    } else {
      setButtons(false);
      document.getElementById("msg").innerHTML = "There are no more solutions.";
    }
  }
71
setQueens()
Replace DOM fragment

  function setQueens(squareList) {
    for (var i = 1; i <= boardsize; i++) {
      var id = i + "-" + (squareList[i-1].value);
      document.getElementById(id).innerHTML = " ";
    }
  }
73
Backtracking state in the server
[Figure: each session (session-1 … session-N) has its own thread holding the backtracking state. An HTTP worker thread handling a request such as “GET /prolog/next session-id=1” exchanges Prolog terms with the session thread and replies with a JSON document.]
74
Backtracking state

  solve(Goal, Bindings, ThreadID) :-
      thread_self(Me),
      thread_statistics(Me, cputime, T0a),
      State = client(ThreadID, T0a),
      solve_2(Goal, Bindings, Solution),        % (Guarded) actual goal
      State = client(Client, T0),
      thread_statistics(Me, cputime, T1),
      Time is T1 - T0,
      solution_time(Solution, Time),
      nb_setarg(2, State, T1),
      debug(prolog_server, 'Sending: ~q', [Solution]),
      thread_send_message(Client, Solution),    % Send reply
      solution_type(Solution, Type),
      (   Type == last
      ->  true
      ;   Type == true
      ->  catch(thread_get_message(command(From, Command)),   % Wait for user
                _, Command = stop),
          debug(prolog_server, 'Command: ~q', [Command]),
          nb_setarg(1, State, From),
          Command == stop
      ;   true
      ).
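The session thread above stays suspended inside a nondeterministic goal and produces the next solution only when the client asks for it. A rough analogue in JavaScript is a generator kept per session. The sketch below is an illustration of that idea only, not a translation of the server code: `queens`, `firstSolution`, `nextSolution` and the `running` table are hypothetical names, and error handling and timing are left out.

```javascript
// Lazily enumerate N-queens solutions; each solution lists the column of the
// queen in rows 1..n. The generator plays the role of the suspended goal.
function* queens(n, placed = []) {
  if (placed.length === n) {
    yield placed.slice();
    return;
  }
  const row = placed.length;
  for (let col = 1; col <= n; col++) {
    // Safe if no earlier queen shares the column or a diagonal.
    const safe = placed.every(
      (c, r) => c !== col && Math.abs(c - col) !== row - r
    );
    if (safe) {
      placed.push(col);
      yield* queens(n, placed);
      placed.pop(); // undo the choice, as backtracking would
    }
  }
}

// Per-session table: session id -> suspended generator.
const running = new Map();

function nextSolution(sessionId) {
  const gen = running.get(sessionId);
  if (!gen) return { solution: null };
  const { value, done } = gen.next();
  if (done) {
    running.delete(sessionId); // no more solutions: drop the state
    return { solution: null };
  }
  return { solution: value };
}

function firstSolution(sessionId, n) {
  running.set(sessionId, queens(n));
  return nextSolution(sessionId);
}

const r1 = firstSolution(1, 4);
const r2 = nextSolution(1);
const r3 = nextSolution(1);
```

For a 4x4 board the three calls return the two existing solutions and then an empty reply, matching the "first", "next", "no more solutions" interaction of the case study.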
75
AJAX has many architectures
From http://www.openajax.org/member/wiki/Whitepaper_20060730
76
Where does the JavaScript come from?
Widget Library: AjaxAnywhere, MochiKit, YUI, ...
User Code
  Instantiation
  Set attributes
  Refine methods
77
Options for generating application JavaScript
Write a JavaScript file and link it from the HTML page
  Code is in two places → good split if the API is stable
  Poor for prototyping and often-changing APIs
Write JavaScript in Prolog strings and include it in the page
  Messy syntax (Python """long string""")
78
Generate from Prolog terms?
Works well for HTML (e.g., html_write, PiLLoW)
But JavaScript customization often places code-fragments in object-properties
No simple interface such as, e.g., XPCE: Create/Set property/Call method
A full mapping of JS code to Prolog syntax is probably not transparent enough for users
79
Wrap-Up
The “Web of data” is out there
  Prolog is an excellent tool for processing RDF
The interactive “Web 2.0” is out there
  Web 2.0 is (relatively) language independent
  Prolog is a suitable server component for Web 2.0
80
Future Directions
Enhance RDF support:
  Improve scalability
  Higher level reasoning
  Provide tabling
  Generalise optimizers
Enhance web-programming support
  Explore cleaner integration with AJAX
Merge into Prolog-Commons Initiative
81
Links
http://www.swi-prolog.org
http://e-culture.multimedian.nl/software/ClioPatria.shtml
http://www.swi-prolog.org/Publications.html
82
Part of the Dutch knowledge-economy project MultimediaN
Partners: VU, CWI, UvA, DEN, ICN
People: Alia Amin, Lora Aroyo, Mark van Assem, Victor de Boer, Lynda Hardman, Michiel Hildebrand, Laura Hollink, Marco de Niet, Borys Omelayenko, Marie-France van Orsouw, Jacco van Ossenbruggen, Guus Schreiber, Jos Taekema, Annemiek Teesing, Anna Tordai, Jan Wielemaker, Bob Wielinga
Artchive.com, RKD, Rijksmuseum Amsterdam, Dutch ethnology museums (Amsterdam, Leiden), National Library (Bibliopolis)
http://e-culture.multimedian.nl