SchemaLog – A Visual Perspective CPSC 534B Laks V.S. Lakshmanan UBC (names of schema components abbreviated.)
Basic Queries. [Figure: query templates for dbStateA, dbStateB and dbStateC (relation salesInfo), with variables P and S1, S2, S3 under columns p and s.] Condition box: S1 > 15M & S2 > 15M & S3 > 15M. All positions treated alike. Distinguished (i.e., output) variables. Conjunctive query!
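A minimal Python sketch of how such a conjunctive query could be evaluated. It assumes that each of the three databases holds a salesInfo relation with attributes p and s, and that P is the distinguished variable; the sample data and the function name products_above are hypothetical, not from the slides.

```python
# Hypothetical sample data: each database holds a salesInfo relation whose
# tuples have attributes p (bound to P) and s (sales, in millions).
federation = {
    "dbStateA": {"salesInfo": [{"p": "widget", "s": 20}, {"p": "gadget", "s": 10}]},
    "dbStateB": {"salesInfo": [{"p": "widget", "s": 18}, {"p": "gadget", "s": 30}]},
    "dbStateC": {"salesInfo": [{"p": "widget", "s": 16}, {"p": "gadget", "s": 25}]},
}


def products_above(federation, threshold):
    """Return bindings of P such that S1, S2 and S3 all exceed the threshold."""
    answers = set()
    for ta in federation["dbStateA"]["salesInfo"]:
        for tb in federation["dbStateB"]["salesInfo"]:
            for tc in federation["dbStateC"]["salesInfo"]:
                # The shared variable P forces the same p value in all three tuples.
                same_p = ta["p"] == tb["p"] == tc["p"]
                # The condition box: S1 > t & S2 > t & S3 > t.
                if same_p and min(ta["s"], tb["s"], tc["s"]) > threshold:
                    answers.add(ta["p"])
    return answers


result = products_above(federation, 15)
```

The tuple variables ta, tb, tc are used only existentially; the output is just the set of bindings of P.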
Basic Queries (contd.) [Figure: query templates for dbStateB and dbStateC (relation salesInfo), with variables P, B and S2, S3; column b carries variable B.] Condition box: S2 > 20M & S3 > 20M.
Basic Queries – Observations. Variables can be used anywhere; positions tell which is which. Order of arguments doesn’t matter: arguments are named (unlike in datalog, standard FOL, or RC). Tuple variables: their use is mostly existential (for now). So far, output = set of tuples of bindings of the (distinguished) variables. How do we: Express a union of CQs? Express recursive queries (what is recursion for SchemaLog anyway)? Construct output the way we’d like, e.g., one relation per brand in Q2?
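One way to picture the last question, one relation per brand, is to let a binding of the brand variable name the output relation. The Python sketch below assumes Q2 yields (B, P) bindings; the data and the name one_relation_per_brand are hypothetical.

```python
from collections import defaultdict

# Hypothetical (B, P) bindings, as a query like Q2 might produce them.
bindings = [("acme", "widget"), ("acme", "gadget"), ("zenith", "radio")]


def one_relation_per_brand(bindings):
    """Build an output database with one relation per value bound to B."""
    out_db = defaultdict(set)
    for brand, product in bindings:
        # The brand value is used in the relation-name position of the output.
        out_db[brand].add(product)
    return dict(out_db)


out_db = one_relation_per_brand(bindings)
```

Here a data value ends up as schema (a relation name), which a flat set of output tuples cannot express.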
Output construction (1) [Figure: a query template over relation stocks and an output template; the surviving labels are t, S, d, c, D, P.]
Output construction (2) [Figure: a query template over relation stocks and an output template; the surviving labels are t, S, d, X, D, P, plus the annotation “X date”.]
Other issues. Try out other possibilities. What output structure do you get in each case? Union of CQs = set of (non-recursive) rules. So what’s recursion? Relation variables can cause recursion, depending on the semantics adopted.
What’s recursion in SchemaLog? Example template: db::X[ ], …. Remember: we want to assume safety of rules (definition similar to that for datalog). Question: What does X range over? Only existing database relations? Or also newly created ones?
What’s recursion? Concrete example: in::out[f(T,U): a->A, c->B] :- in::X[T: a->A, Any->Y], Any <> a, Any <> b, in::r[U: Any->Y, b->B]. Consider an instance of the in database. [Figure: two relations, r and s, with column labels a, c and c, b.] Does X range over {r, s} or over {r, s, out}? Former case: no recursion. Latter case: recursion, since we can join r and s, but also out and s, while creating out!
More on SchemaLog recursion. Call the former case “restricted semantics” and the latter case “unrestricted semantics”. Try the previous example with your own sample data. Under unrestricted semantics, can you predict the maximum number of iterations needed to evaluate relation out, no matter what the input database? For datalog, recursion = cycle(s) in the predicate dependency graph. How would you characterize recursion in SchemaLog (under either semantics)?
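For experimenting with the two semantics, here is a small Python sketch of a naive evaluator for the rule from the previous slide. Everything beyond the rule itself is an assumption: relations are dicts from tuple-id to attribute/value dicts, the term f(T,U) is modelled as the Python tuple ("f", T, U), s is taken to have attributes a, c and r attributes c, b, and the sample data is deliberately acyclic so that evaluation stops. The max_rounds guard is there because nothing in the sketch guarantees termination in general.

```python
def apply_rule(db, candidates):
    """One application of the rule, with X ranging over `candidates`."""
    derived = {}
    for x in candidates:
        for t_id, t in db.get(x, {}).items():
            if "a" not in t:          # subgoal in::X[T: a->A, ...]
                continue
            for any_attr, y in t.items():
                if any_attr in ("a", "b"):   # Any <> a, Any <> b
                    continue
                for u_id, u in db["r"].items():  # subgoal in::r[U: Any->Y, b->B]
                    if "b" in u and any_attr in u and u[any_attr] == y:
                        derived[("f", t_id, u_id)] = {"a": t["a"], "c": u["b"]}
    return derived


def evaluate(db, unrestricted, max_rounds=100):
    """Iterate the rule to a fixpoint; X sees `out` only if `unrestricted`."""
    db = {name: dict(rel) for name, rel in db.items()}
    base = [name for name in db if name != "out"]
    db.setdefault("out", {})
    candidates = base + (["out"] if unrestricted else [])
    for _ in range(max_rounds):
        new = apply_rule(db, candidates)
        if all(tid in db["out"] for tid in new):
            return db["out"]
        db["out"].update(new)
    raise RuntimeError("no fixpoint reached within max_rounds")


# Hypothetical acyclic instance of the `in` database.
sample = {
    "s": {"s1": {"a": 1, "c": 10}},
    "r": {"r1": {"c": 10, "b": 20}, "r2": {"c": 20, "b": 30}},
}

restricted = evaluate(sample, unrestricted=False)
unrestricted = evaluate(sample, unrestricted=True)
```

On this instance the restricted run derives one out tuple, while the unrestricted run derives a second one by feeding out back into the X subgoal. Swapping in your own data (including cyclic r) is a quick way to explore the iteration question.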
Using function symbols. SchemaLog allows arbitrary first-order terms in each position: db, relation, attribute, value, tuple-id. This makes for powerful restructuring (and hence visualization) of information.
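As a rough illustration of such restructuring, the Python sketch below promotes a data value to an attribute name and uses a function term as the new tuple-id. The stocks schema (date, company, price), the data, and the name pivot_by_date are all hypothetical; the term f(D) is modelled as the Python tuple ("f", D).

```python
from collections import defaultdict

# Hypothetical stocks relation; attribute names are assumptions.
stocks = [
    {"date": "d1", "company": "ibm", "price": 100},
    {"date": "d1", "company": "hp", "price": 50},
    {"date": "d2", "company": "ibm", "price": 110},
]


def pivot_by_date(rows):
    """One output tuple per date, with one attribute per company value."""
    out = defaultdict(dict)
    for row in rows:
        tid = ("f", row["date"])                   # tuple-id term f(D)
        out[tid]["date"] = row["date"]
        out[tid][row["company"]] = row["price"]    # value used as attribute name
    return dict(out)


pivoted = pivot_by_date(stocks)
```

Because the tuple-id depends only on the date, all facts for one date land in the same output tuple; choosing a different term (say, one built from both date and company) would give a different output structure.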