Presentation is loading. Please wait.

Presentation is loading. Please wait.

Querying XML – Concluded Introduction to Views Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems October 9, 2003 Some.

Similar presentations


Presentation on theme: "Querying XML – Concluded Introduction to Views Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems October 9, 2003 Some."— Presentation transcript:

1 Querying XML – Concluded Introduction to Views Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems October 9, 2003 Some slide content courtesy of Susan Davidson, Dan Suciu, & Raghu Ramakrishnan

2 2 Administrivia  Final exam will be take-home, given out the last day of class and due Dec. 18 th at 1PM  We’ll try to post HW3 answers on the Web shortly  Project page on web site now links to more detailed descriptions and references for each project

3 3 XQuery’s Basic Form  Has an analogous form to SQL’s SELECT..FROM..WHERE..GROUP BY..ORDER BY  The model: bind nodes (or node sets) to variables; operate over each legal combination of bindings; produce a set of nodes  “FLWOR” statement: for {iterators that bind variables} let {collections} where {conditions} order by {order-conditions} return {output constructor}

4 4 “Iterations” in XQuery A series of (possibly nested) FOR statements assigning the results of XPaths to variables for $root in document(“http://my.org/my.xml”)http://my.org/my.xml for $sub in $root/rootElement, $sub2 in $sub/subElement, …  Something like a template that pattern-matches, produces a “binding tuple”  For each of these, we evaluate the WHERE and possibly output the RETURN template  document() or doc() function specifies an input file as a URI

5 5 Two XQuery Examples { for $p in document(“dblp.xml”)/dblp/proceedings, $yr in $p/yr where $yr = “1999” return {$p} } for $i in doc (“dblp.xml”)/dblp/inproceedings[author/text() = “John Smith”] return { $i/title/text() } { $i/@key } { $i/crossref }

6 6 Nesting in XQuery Nesting XML trees is perhaps the most common operation In XQuery, it’s easy – put a subquery in the return clause where you want things to repeat! for $u in doc(“dblp.xml”)/universities where $u/country = “USA” return { $u/title} { for $mt in $u/../mastersthesis where $mt/year/text() = “1999” return $mt/title }

7 7 Collections & Aggregation in XQuery  In XQuery, many operations return collections  XPaths, sub-XQueries, functions over these, …  The let clause assigns the results to a variable  Aggregation simply applies a function over a collection, where the function returns a value (very elegant!) let $allpapers := doc(“dblp.xml”)/dblp/article return {fn:count(fn:distinct-values($allpapers/authors)) } {for $paper in doc(“dblp.xml”)/dblp/article let $pauth := $paper/author return {$paper/title} { fn:count($pauth) } }

8 8 Sorting in XQuery  SQL actually allows you to sort its output, with a special ORDER BY clause (which we haven’t discussed)  XQuery borrows this idea  In XQuery, what we order is the sequence of “result tuples” output by the return clause: for $x in doc(“dblp.xml”)/proceedings order by $x/title/text() return $x

9 9 Distinct-ness  XQuery has a notion that DISTINCT-ness happens as a function over a collection  But since we have nodes, we can do duplicate removal according to value or node  Can do fn:distinct-values(collection) to remove duplicate values, or fn:distinct-nodes(collection) to remove duplicate nodes for $years in fn:distinct-values(doc(“dblp.xml”)//year/text() return $years

10 10 Querying & Defining Metadata – Can’t Do This in SQL  Can get a node’s name by querying node-name(): for $x in document(“dblp.xml”)/dblp/* return node-name($x)  Can construct elements and attributes using computed names: for $x in document(“dblp.xml”)/dblp/*, $year in $x/year, $title in $x/title/text(), element node-name($x) { attribute {“year-” + $year} { $title } }

11 11 XQuery Summary  Very flexible and powerful language for XML  Clean and orthogonal: can always replace a collection with an expression that creates collections  DB and document-oriented (we hope)  The core is relatively clean and easy to understand  Actually Turing Complete – we’ll talk more about XQuery functions soon

12 12 XSL(T): The Bridge Back to HTML  XSL (XML Stylesheet Language) is actually divided into two parts:  XSL:FO: formatting for XML  XSLT: a special transformation language  We’ll leave XSL:FO for you to read off www.w3.org, if you’re interestedwww.w3.org  XSLT is actually able to convert from XML  HTML, which is how many people do their formatting today  Products like Apache Cocoon generally translate XML  HTML on the server side

13 13 A Different Style of Language  XSLT is based on a series of templates that match different parts of an XML document  There’s a policy for what rule or template is applied if more than one matches (it’s not what you’d think!)  XSLT templates can invoke other templates  XSLT templates can be nonterminating (beware!)  XSLT templates are based on XPath “match”es, and we can also apply other templates (potentially to “select”ed XPaths)  Within each template, we describe what should be output  (Matches to text default to outputting it)

14 14 An XSLT Stylesheet This is DBLP …

15 15 Results of XSLT Stylesheet Paper1 Smith Chakrabarti Gray Paper2 This Is DBLP Paper1 Smith Paper2 Chakrabarti Gray

16 16 What XSLT Can and Can’t Do  XSLT is great at converting XML to other formats  XML  diagrams in SVG; HTML; LaTeX  …  XSLT doesn’t do joins (well), it only works on one XML file at a time, and it’s limited in certain respects  It’s not a query language, really  … But it’s a very good formatting language  Most web browsers (post Netscape 4.7x) support XSLT and XSL formatting objects  But most real implementations use XSLT with something like Apache Cocoon  You may want to use XSL/XSLT for your projects – see www.w3.org/TR/xslt for the spec www.w3.org/TR/xslt

17 17 Wrapping Up XQuery We’ve seen three XML manipulation formalisms today:  XPath: the basic language for “projecting and selecting” (evaluating path expressions and predicates) over XML  XQuery: a statically typed, Turing-complete XML processing language  XSLT: a template-based language for transforming XML documents  Each is extremely useful for certain applications!

18 18 Views  A view is a named query  We can use the name of the view to invoke the query SQL: CREATE VIEW S(A,B,C) AS SELECT A,B,C FROM R WHERE R.A = “123” XQuery: declare function S() as element(content)* { for $r in doc(“R”)/root, $a in $r/a, $b in $r/b, $c in $r/c where $a = “123” return {$a, $b, $c} }

19 19 What’s Useful about Views Providing security/access control  We can assign users permissions to see only particular views Can be used as relations in other queries  Allows the user to query things that make more sense Describe transformations from one schema (the base relations) to another (the output of the view)  This will be incredibly useful in data integration, discussed soon Can be materialized  i.e., the results of the view can be saved on disk (as in caching)  Techniques exist for using materialized views to answer other queries

20 20 Views Should Stay Fresh  Views (sometimes called intensional relations) behave, from the perspective of a query language, exactly like base relations (extensional relations)  But there’s an association that should be maintained:  If tuples change in the base relation, they should change in the view (whether it’s materialized or not)  If tuples change in the view, that should reflect in the base relation(s)

21 21 View Maintenance and the View Update Problem  There exist algorithms to incrementally recompute the results of a view when the base relations change  We can try to do the same thing in reverse  However, there are lots of views that aren’t easily updatable:  We can ensure views are updatable by enforcing certain constraints (e.g., no aggregation) AB 12 22 BC 24 23 R S ABC 124 123 224 223 R⋈SR⋈S delete?

22 22 Reasoning about Queries and Views  Given that views have so many uses, it’s helpful to know how and why they’re so useful  First we need a clean way of describing views (let’s assume a relational model for now)  Should be declarative  Should be less complex than SQL  Doesn’t need to support all of SQL – aggregation, for instance, may be more than we need

23 23 Datalog and Datalog Rules  Relational calculus was a first order logic-based way of expressing queries, where our formulas were:  = R(v 1, …, v n ) | v i = v j | v i = c |  Æ  ’ |  Ç  ’ | 8 x.  | 9 x.   Recall that relational calculus (and first-order logic) was relationally complete but couldn’t express some operations like transitive closure of graph edges in a relation  Datalog is based on FO logic and “Horn rules”:  head(distinguished variables) :- subgoal(variables) & … & conditions view(v 1, …, v n ) :- r 1 (v 1, …, v m ) & r 2 (v3, v4, …, v n ) & v 1 = “x”  A single Datalog rule, which has no “ Ç,” “ :,” “ 8 ” can express select, project, and join – a conjunctive query

24 24 To Come…  How we answer Datalog queries  Reasoning about whether Datalog rules are “safe”  Comparing Datalog rules to figure out when a view has enough info to answer a different query  Datalog for mapping between schemas  … and much, much more!

25 25 Midterm Overview  I expect you to be able to:  Apply RA and DRC to answer queries  Write simple SQL queries, including joins and aggregation  Write simple XQueries, including nesting of output  Be able to draw meaningful ER diagrams  Tell if something’s in 3NF or BCNF, given FDs  Decompose relations using the 3NF algorithm, given a cover  Given a set of FDs, be able to compute a cover  You won’t need to:  Compute a closure  Apply the BCNF algorithm  Write 10-page SQL queries


Download ppt "Querying XML – Concluded Introduction to Views Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems October 9, 2003 Some."

Similar presentations


Ads by Google