A Visual Query Language for Business Processes
Catriel Beeri (Hebrew University)
Anat Eyal, Simon Kamenkovich, Tova Milo (Tel Aviv University)

Beeri, Eyal, Kamenkovitch, Milo

Outline
Introduction and motivation
Overview by example
System and query formal model
Compact representation of query answers
Implementation
Summary

The Standard Example
[Figure: purchase-order process. Participants: Client, PO Service, Credit Service, Inventory Service. Messages and steps: Purchase Order, Credit Check, Reserve Inventory, Credit Response, Inventory Response, Invoice, Consolidate Results.]

Web Services Meet Business Processes
[Figure: the business process of company A calls Web Service 1 through Web Service n. Some services are local to company A, some are at company B, and some are on the Web.]

Recent History of Business Process Standards
2000/05  XLANG (Microsoft)
2001/03  BPML (Intalio et al.)
2001/05  WSFL (IBM)
2001/06  BPSS (ebXML)
2002/03  WSCL (HP)
2002/06  WSCI (Sun et al.)
2002/08  BPEL4WS 1.0 (IBM, Microsoft)
2003/01  WS-Choreography (W3C)
2003/04  BPEL4WS 1.1 (OASIS)

BPEL in a nutshell
Business process: control and data flow
Actions:
– atomic
– compound: fork, join, while, procedure/process call, …
Process spec. represented in XML
Commercial products:
– design tools with a “conceptual” model
– application generators compile to executable code
– application servers execute the code

Points to note
Not just a modeling language: applications are automatically generated from the (relatively declarative) specs
There will be plenty of such specs around
Specs are data; there is a lot of interest in querying them:
– “What kind of credit services are used, directly or indirectly?”
– “How can I buy a plane ticket?”
– “Can one get a price quote without first giving credit card info?”
This is the motivation for developing a query facility

Can a QL be based on BPEL? NO!
The BPEL XML representation is machine oriented, unfit for human consumption
It decomposes the system diagram into an (XML-based) relational representation of nodes and edges
Consequence: loss of understanding, and queries will require lots of joins
Need: similarity of system and query models

Why not XQuery? (on an XML representation of the conceptual model)
Additional requirements from the query language:
– graphs rather than trees
– queries about control and data flow paths
– need to query at different levels of granularity (zoom-in/zoom-out)
– P2P architecture: specs are distributed
We use a home-brewed QL

Querying Specs, not Runs
Not the same semantics:
– Q (querying): “Does the spec contain these two operations?”
– V (verification): “Can there be a run that contains these two operations?”
Querying the specs:
– is cheaper (specs are data, not possible executions)
– gives a reasonably good approximation of actual runs
A (practical) problem with verification: it needs a clear semantics...

A system specification
Property, data and activity nodes
Control and data flow edges
A process is a di-graph of atomic and compound activities
Compound activities can be zoomed into (à la statecharts)

Travel Agency Process

Zoom In

Queries
Use process patterns (like tree patterns for XML)
Single/double-headed edges (compare to / and // in XPath):
– single-headed: edges
– double-headed: paths of arbitrary length
Single/double-bounded activities:
– single-bounded: w/o zoom-in
– double-bounded: unbounded zoom-in
Allow * as node label
Node label variables
Mark requested nodes/edges by [graphical marker]

Query1: provided operations?

Query2: used credit card services? local

Query3: search without login?

Query4: data flow
Data elements affected by searchRequest and affecting returnTripResults

Formal Model
System:
– process graphs + implementation function
– graph refinement (graph rewriting)
Query:
– process patterns + implementation function

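The system side of the model can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `ProcessGraph` class, the `refine` helper, the single entry/exit convention, the node ids, and the labels `login` and `reserveTrip` do not come from the slides. It reads one refinement step as replacing a compound node by a copy of its implementing graph, which is only one possible reading of "graph rewriting".

```python
from dataclasses import dataclass


@dataclass
class ProcessGraph:
    """A directed graph of activity nodes (hypothetical representation)."""
    labels: dict   # node id -> activity label
    edges: set     # set of (source id, target id) pairs
    entry: str     # node where control enters
    exit: str      # node where control leaves


def refine(graph, node, impl, prefix):
    """Replace compound `node` by a copy of `impl`: one rewriting step."""
    rename = {n: f"{prefix}.{n}" for n in impl.labels}
    labels = {n: lab for n, lab in graph.labels.items() if n != node}
    labels.update({rename[n]: lab for n, lab in impl.labels.items()})
    edges = {(rename[a], rename[b]) for a, b in impl.edges}
    for a, b in graph.edges:
        # Edges into the node now enter the implementation's entry;
        # edges out of the node now leave from the implementation's exit.
        a2 = rename[impl.exit] if a == node else a
        b2 = rename[impl.entry] if b == node else b
        edges.add((a2, b2))
    entry = rename[impl.entry] if graph.entry == node else graph.entry
    exit_ = rename[impl.exit] if graph.exit == node else graph.exit
    return ProcessGraph(labels, edges, entry, exit_)


top = ProcessGraph(
    labels={"n1": "login", "n2": "searchTrip", "n3": "reserveTrip"},
    edges={("n1", "n2"), ("n2", "n3")},
    entry="n1", exit="n3")
search_impl = ProcessGraph(
    labels={"s1": "searchRequest", "s2": "returnTripResults"},
    edges={("s1", "s2")}, entry="s1", exit="s2")

# The implementation function, here a plain dict from label to graph.
implementation = {"searchTrip": search_impl}
refined = refine(top, "n2", implementation[top.labels["n2"]], prefix="n2")
```

Applying `refine` repeatedly yields the family of refinements that a query pattern may be matched against.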
System example implementation

Refinement (rewriting) searchTrip refined to

Query
Process patterns: * node labels, node variables
Nodes/edges can be marked as transitive:
– transitive edge: paths of arbitrary length
– transitive node: refinements of arbitrary depth
Nodes/edges can be marked as output
Patterns are related by a query implementation function

Query semantics
An embedding: a mapping
– from: query graphs (with uniquely identifiable nodes)
– to: [refinements of] process graphs
satisfying the conditions:
(nodes): preserves node types and labels
(edges): a single-headed edge is mapped to an edge, a double-headed edge is mapped to a path
(imp): preserves implementation relationships
A result: the image of the query graph under an embedding
Answer: all results

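Conditions (nodes) and (edges) can be checked for a given candidate mapping with a short Python sketch. It rests on several assumptions: the function names and the dict/set encoding are hypothetical, node types are ignored (only labels are compared), a path is taken to have at least one edge, and condition (imp) is left out because the graphs here are flat.

```python
def reachable(edges, src):
    """Nodes reachable from src by a path of one or more edges."""
    seen, stack = set(), [src]
    while stack:
        node = stack.pop()
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def is_embedding(query_labels, query_edges, sys_labels, sys_edges, mapping):
    """Check (nodes) and (edges) for a candidate node mapping.

    query_edges holds triples (source, target, transitive). A transitive
    (double-headed) edge may map to a path, any other edge to one edge.
    """
    for q, label in query_labels.items():
        if label != "*" and sys_labels[mapping[q]] != label:
            return False                      # (nodes) violated
    for a, b, transitive in query_edges:
        src, dst = mapping[a], mapping[b]
        if transitive:
            if dst not in reachable(sys_edges, src):
                return False                  # no path in the system
        elif (src, dst) not in sys_edges:
            return False                      # no single edge in the system
    return True


# A small made-up system: A -> B -> C -> D.
sys_labels = {"a": "A", "b": "B", "c": "C", "d": "D"}
sys_edges = {("a", "b"), ("b", "c"), ("c", "d")}
query_labels = {1: "A", 2: "B", 3: "D"}
mapping = {1: "a", 2: "b", 3: "d"}

with_path = is_embedding(query_labels, [(1, 2, False), (2, 3, True)],
                         sys_labels, sys_edges, mapping)
with_edge = is_embedding(query_labels, [(1, 2, False), (2, 3, False)],
                         sys_labels, sys_edges, mapping)
```

Here `with_path` holds because B reaches D through C, while `with_edge` fails because the system has no direct edge from B to D.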
Examples:
[Figure, first example: system with nodes A, B, C; query with nodes A1, B2, B3, *4; answer with nodes A1, B2, B3, C4]
[Figure, second example: system with nodes A, B, C, D; query with nodes A1, B2, D3; answer with nodes A1, B2, C, B, D3]

Large or infinite answers!
Many fork/joins lead to an exponential number of paths

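The exponential growth is easy to reproduce with a small Python sketch. The graph shape is an assumption chosen for illustration (k two-way fork/join blocks in sequence), and `fork_join_chain` and `count_paths` are hypothetical helper names. The count is of graph paths, which is what a double-headed query edge can match.

```python
from functools import lru_cache


def fork_join_chain(k):
    """Adjacency lists for k fork/join blocks in sequence."""
    edges = {}
    for i in range(k):
        edges[f"j{i}"] = [f"l{i}", f"r{i}"]   # fork into two branches
        edges[f"l{i}"] = [f"j{i + 1}"]        # both branches join again
        edges[f"r{i}"] = [f"j{i + 1}"]
    edges[f"j{k}"] = []
    return edges


def count_paths(edges, src, dst):
    """Number of distinct paths from src to dst in an acyclic graph."""
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == dst:
            return 1
        return sum(paths_from(nxt) for nxt in edges[node])
    return paths_from(src)


print(count_paths(fork_join_chain(10), "j0", "j10"))  # 1024
```

The graph has only 3k + 1 nodes, yet the number of paths doubles with every block, so listing every result explicitly does not scale.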
Infinite answers can be generated by either:
– cycles in the process graph (example seen previously)
– cycles (recursion) in zoom-in relationships
Recursive zoom-in gives an infinite number of refinements
Path variables (double-headed edges) + cycles give an infinite number of results

Recursive zoom-in

A query with an infinite answer

Finite representation

Systems and queries are essentially regular graph grammars
Bad news: not closed under intersection (general case)
Good news:
– our systems and queries are sufficiently simple: a Psize representation (as a regular grammar) can be computed in Ptime (data complexity!)

Answer construction
Observations:
Embeddings (results) can be composed of embeddings (results) of query patterns into system processes; the glue is the constraint (imp)
Multiple embeddings can be represented by a shared node mapping (one that satisfies (nodes) and (edges), i.e. a homomorphism)
Construction of node mappings is the core problem

Node mapping/answer construction
The simple case: from p_Q to p_S (not into a refinement):
Find node mappings that satisfy (nodes) and (edges): NP-complete in p_Q (query), Ptime in p_S (data)
Results are 1-1 images of mappings:
– each query node is distinct
– new nodes and edges for paths (images of double-headed edges)
[Figure: system with nodes A, B, C, D; query with nodes A1, B2, D3; answer with nodes A1, B2, C, B, D3]

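The simple case can be sketched as a brute-force search in Python. This is not the authors' algorithm: `node_mappings`, the dict/set encoding, and the example graph with its edges are assumptions made for illustration, and paths are again taken to have at least one edge. The sketch tries every label-compatible assignment, so its cost is exponential in the number of query nodes but polynomial in the system size for a fixed query, in line with the complexity stated above.

```python
from itertools import product


def node_mappings(query_labels, query_edges, sys_labels, sys_edges):
    """Yield every mapping satisfying (nodes) and (edges), by brute force."""
    # Transitive closure of the system edges, used for double-headed edges.
    closure = set(sys_edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in sys_edges:
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    # (nodes): each query node may only map to label-compatible nodes.
    q_nodes = sorted(query_labels)
    candidates = [
        [s for s in sorted(sys_labels)
         if query_labels[q] in ("*", sys_labels[s])]
        for q in q_nodes
    ]
    for choice in product(*candidates):
        mapping = dict(zip(q_nodes, choice))
        # (edges): single-headed -> system edge, double-headed -> system path.
        if all((mapping[a], mapping[b]) in (closure if transitive else sys_edges)
               for a, b, transitive in query_edges):
            yield mapping


# Made-up system with a cycle between B and C: A -> B, B -> C, C -> B, B -> D.
sys_labels = {"a": "A", "b": "B", "c": "C", "d": "D"}
sys_edges = {("a", "b"), ("b", "c"), ("c", "b"), ("b", "d")}

# Query: A -> B (single-headed), then B to D by a double-headed edge.
found = list(node_mappings({1: "A", 2: "B", 3: "D"},
                           [(1, 2, False), (2, 3, True)],
                           sys_labels, sys_edges))
```

With this example, `found` contains one node mapping, yet because of the cycle that single mapping stands for infinitely many results (B, D; B, C, B, D; and so on). This is why a shared node mapping is a convenient unit for representing answers compactly.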
The complex case: from p_Q to a refinement of p_S:
A mapping may be split into parts:
– into p_S itself
– into the implementation of a compound action of p_S, and so on
With double-headed edges: to any depth, including cycles
Observations:
Only sub-graphs of p_Q with a single entry/exit can be mapped to an implementation
In one mapping, a disjoint set of such graphs can be mapped to an implementation

The construction:
For each p_Q and p_S:
– for each disjoint set J of such sub-graphs, construct mappings from p_Q/J (each member of J replaced by a $* node) to p_S
– do the same for these sub-graphs
Glue together using a regular grammar:
– $* nodes (suitably labeled) are non-terminals
– each node (for G) is associated with the mappings for G

[Figure: an example query graph and an example system graph, with node labels A, B, C, D, E, F]

Extensions to QL:
OK extensions:
– label predicates
– regular path expressions (on node labels)
– negation
– joins on node labels
Not OK extensions:
– joins on path variables
The result is not a regular graph grammar
Emptiness is undecidable
NP-hard even without cycles and recursion

Simple and intuitive query formulation, similar to how processes are specified
Operates in a distributed environment
System and queries modeled as graph grammars: allows compact representation of large/infinite answers
Ignores semantics of composite actions
AXML as an implementation platform supports:
– transparent distribution
– taking advantage of built-in optimizations

Highlights of Model & QL
Base system unit: directed node-labeled graph
QL primitives:
– /, // (forward axis only)
– node equality
– node label equality
– node label predicates
– regular expressions on node labels in paths
– negation on sub-patterns
Can be seen as an adaptation of XPath to di-graphs, but selection is not restricted to one path

Advanced system structure: an infinite family of directed node-labeled graphs, specified by a graph grammar
Additional QL primitives:
– double-headed node: query on all graphs derived from a given graph
Answer representation as a graph grammar

Future work
Investigate queries on logs: does this pattern exist?
“Querying of runs”: essentially verification
“Marxist” extensions to QL primitives