1Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 OCL3 Oracle 10g: SQL & PL/SQL Session #10 Matthew P. Johnson CISDD, CUNY January, 2005
2 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Agenda Security & web apps RegEx support in 10g Oracle & XML Data warehousing More on the PL/SQL labs Any more lab?
3 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Review: Why security is hard It’s a “negative deliverable” It’s an asymmetric threat Tolstoy: “Happy families are all alike; every unhappy family is unhappy in its own way.” Analogs: “homeland”, jails, debugging, proof- reading, Popperian science, fishing, MC algs So: fix biggest problems first
4 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DB users have privileges SELECT : read access to all columns INSERT(col-name) : can insert rows with non- default values in this column INSERT : can insert rows with non-default values in all columns DELETE REFERENCES(col-name) : can define foreign keys that refer to (or other constraints that mention) this column TRIGGER : triggers can reference table EXECUTE : can run function/SP
5 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Granting privileges (Oracle) One method of setting access levels Creator of object automatically gets all privileges to it Possible objects: tables, whole databases, stored functions/procedures, etc. .* - all tables in DB A privileged user can grant privileges to other users or groups GRANT privileges ON object TO user GRANT SELECT ON mytable TO someone WITH GRANT OPTION;
6 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Granting and revoking Privileged user has privileges Privileged-WGO user can grant them, w/wo GO Granter can revoke privileges or GO Revocation cascades by default To prevent, use RESTRICT (at end of cmd) If would cascade, command fails Can change owner: ALTER TABLE my-tbl OWNER TO new-owner; ALTER TABLE my-tbl OWNER TO new-owner;
7 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Granting and revoking What we giveth, we may taketh away mjohnson: (effects?) george: (effects?) mjohnson: (effects?) GRANT SELECT, INSERT ON my-table TO george WITH GRANT OPTION; GRANT SELECT ON my-table TO laura; REVOKE SELECT ON my-table FROM laura;
8 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Role-based authorization In SQL-1999, privileges assigned with roles For example: Student role Instructor role Admin role Each role gets to do same (sorts of) things Privileges assigned by assigning role to users GRANT SELECT ON my-table TO employee; GRANT employee TO billg;
9 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Passwords DBMS recognizes your privileges because it recognizes you how? Storing passwords in the DB is a bad idea
10 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Hashed or digested passwords One-way hash function: 1. computing f(x) is easy; 2. Computing f -1 (y) is hard/impossible; 3. Finding some x2 s.t. f(x2) = f(x) is hard/imposs “collisions” Intuitively: seeing f(x) gives little (useful) info on x x “looks random” PRNGs MD5, SHA-1 RFID for cars:
11 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Built-in accounts Many DBMSs (and OSs) have built-in demo accounts by default In some versions, must “opt out” MySQL: root/(blank) (closed on sales) Oracle: scott/tiger (was open on sales last year) SQLServer: sa/(blank/null)
12 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Query-related: Injection attacks Here’s a situation: Prompt for user/pass Do lookup: If found, user gets in test.user table in MySQL p.txt p.txt Apart from no hashing, is this safe? SELECT * FROM users WHERE user=u AND password=p; SELECT * FROM users WHERE user=u AND password=p;
13 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Injection attacks We expect to get input of something like: user: mjohnson pass: secret SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user= 'mjohnson' AND password = 'secret'; SELECT * FROM users WHERE user= 'mjohnson' AND password = 'secret';
14 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Injection attacks – MySQL/Perl/PHP Consider another input: user: ' OR 1=1 OR user = ' pass: ' OR 1=1 OR pass = ' SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = ' ' OR 1=1 OR user = ' ' AND password = ' ' OR 1=1 OR pass = ' '; SELECT * FROM users WHERE user = ' ' OR 1=1 OR user = ' ' AND password = ' ' OR 1=1 OR pass = ' '; SELECT * FROM users WHERE user = '' OR 1=1 OR user = '' AND password = '' OR 1=1 OR pass = ''; SELECT * FROM users WHERE user = '' OR 1=1 OR user = '' AND password = '' OR 1=1 OR pass = '';
15 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Injection attacks – MySQL/Perl/PHP Consider this one: user: your-boss ' OR 1=1 # pass: abc SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = 'your-boss ' OR 1=1 #' AND password = 'abc'; SELECT * FROM users WHERE user = 'your-boss ' OR 1=1 #' AND password = 'abc'; SELECT * FROM users WHERE user = 'your-boss' OR 1=1 #' AND password = 'abc'; SELECT * FROM users WHERE user = 'your-boss' OR 1=1 #' AND password = 'abc';
16 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Injection attacks – MySQL/Perl/PHP Consider another input: user: your-boss pass: ' OR 1=1 OR pass = ' SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = 'your-boss' AND password = ' ' OR 1=1 OR pass = ' '; SELECT * FROM users WHERE user = 'your-boss' AND password = ' ' OR 1=1 OR pass = ' '; SELECT * FROM users WHERE user = 'your-boss' AND password = '' OR 1=1 OR pass = ''; SELECT * FROM users WHERE user = 'your-boss' AND password = '' OR 1=1 OR pass = '';
17 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Multi-command inj. attacks (other DBs) Consider another input: user: ' ; DELETE FROM users WHERE user = ' abc ' ; SELECT FROM users WHERE password = ' pass: abc SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = ' ' ; DELETE FROM users WHERE user = 'abc'; SELECT FROM users WHERE password = ' ' AND password = 'abc'; SELECT * FROM users WHERE user = ' ' ; DELETE FROM users WHERE user = 'abc'; SELECT FROM users WHERE password = ' ' AND password = 'abc'; SELECT * FROM users WHERE user = ''; DELETE FROM users WHERE user = 'abc'; SELECT FROM users WHERE password = '' AND password = 'abc'; SELECT * FROM users WHERE user = ''; DELETE FROM users WHERE user = 'abc'; SELECT FROM users WHERE password = '' AND password = 'abc';
18 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Consider another input: user: ' ; DROP TABLE users; SELECT FROM users WHERE password = ' pass: abc SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = ' ' ; DROP TABLE users; SELECT FROM users WHERE password = ' ' AND password = 'abc'; SELECT * FROM users WHERE user = ' ' ; DROP TABLE users; SELECT FROM users WHERE password = ' ' AND password = 'abc'; SELECT * FROM users WHERE user = ''; DROP TABLE users; SELECT FROM users WHERE password = '' AND password = 'abc'; SELECT * FROM users WHERE user = ''; DROP TABLE users; SELECT FROM users WHERE password = '' AND password = 'abc'; Multi-command inj. attacks (other DBs)
19 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Consider another input: user: ' ; SHUTDOWN WITH NOWAIT; SELECT FROM users WHERE password = ' pass: abc SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = u AND password = p; SELECT * FROM users WHERE user = ' ' ; SHUTDOWN WITH NOWAIT; SELECT FROM users WHERE password = ' ' AND password = 'abc'; SELECT * FROM users WHERE user = ' ' ; SHUTDOWN WITH NOWAIT; SELECT FROM users WHERE password = ' ' AND password = 'abc'; SELECT * FROM users WHERE user = ''; SHUTDOWN WITH NOWAIT; SELECT FROM users WHERE password = '' AND password = 'abc'; SELECT * FROM users WHERE user = ''; SHUTDOWN WITH NOWAIT; SELECT FROM users WHERE password = '' AND password = 'abc'; Multi-command inj. attacks (other DBs)
20 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Injection attacks – MySQL/Perl/PHP Consider another input: user: your-boss pass: ' OR 1=1 AND user = 'your-boss Delete your boss! DELETE FROM users WHERE user = u AND password = p; DELETE FROM users WHERE user = u AND password = p; DELETE FROM users WHERE user = 'your-boss' AND pass = ' ' OR 1=1 AND user = ' your-boss'; DELETE FROM users WHERE user = 'your-boss' AND pass = ' ' OR 1=1 AND user = ' your-boss'; DELETE FROM users WHERE user = 'your-boss' AND pass = '' OR 1=1 AND user = 'your-boss'; DELETE FROM users WHERE user = 'your-boss' AND pass = '' OR 1=1 AND user = 'your-boss';
21 Matthew P. Johnson, OCL3, CISDD CUNY, June Injection attacks – MySQL/Perl/PHP Consider another input: user: ' OR 1=1 OR user = ' pass: ' OR 1=1 OR user = ' Delete everyone! DELETE FROM users WHERE user = u AND pass = p; DELETE FROM users WHERE user = u AND pass = p; DELETE FROM users WHERE user = ' ' OR 1=1 OR user = ' ' AND pass = ' ' OR 1=1 OR user = ' '; DELETE FROM users WHERE user = ' ' OR 1=1 OR user = ' ' AND pass = ' ' OR 1=1 OR user = ' '; DELETE FROM users WHERE user = '' OR 1=1 OR user = '' AND pass = '' OR 1=1 OR user = ''; DELETE FROM users WHERE user = '' OR 1=1 OR user = '' AND pass = '' OR 1=1 OR user = '';
22 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Preventing injection attacks Ultimate source of problem: quotes Soln 1: don’t allow quotes! Reject any entered data containing single quotes Q: Is this satisfactory? Does Amazon need to sell O’Reilly books? Soln 2: escape any single quotes Replace any ' with a '' or \' In Perl, use taint mode – won’t show In PHP, turn on magic_quotes_gpc flag in.htaccess show both PHP versions
23 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Preventing injection attacks Soln 3: use prepare parameter-based queries Supported in JDBC, Perl DBI, PHP ext/mysqli Very dangerous: using tainted data to run commands at the Unix command prompt Semi-colons, prime char, etc. Safest: define set if legal chars, not illegal ones
24 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Preventing injection attacks When to do security checking for quotes, etc.? Natural choice: in client-side data validation But not enough! As saw earlier: can submit GET and POST params manually Must do security checking on server
25 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 More Info phpGB MySQL Injection Vulnerability "How I hacked PacketStorm“
26 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 SQL*Plus settings SQL> SET RECSEP OFF SQL> COLUMN text FORMAT A60 SQL> SET RECSEP OFF SQL> COLUMN text FORMAT A60
27 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 New topic: Regular Expressions In automata theory, Finite Automata are the simplest weakest of computer, Turing Machines the strongest Chomsky’s Hierarchy FA are equivalent to a regular expression Expressions that specify a pattern Can check whether a string matches the pattern
28 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 RegEx matching Use REGEX_LIKE Metachar for any char is. First, get employee_comment table: Now do search: So far, like LIKE SELECT emp_id, text FROM employee_comment WHERE REGEXP_LIKE(text,' '); SELECT emp_id, text FROM employee_comment WHERE REGEXP_LIKE(text,' ');
29 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 RegEx matching Can also pull out the matching text with REGEXP_SUBSTR: If want only numbers, can specify a set of chars rather than a dot: SELECT emp_id, REGEXP_SUBSTR(text,' ') text FROM employee_comment WHERE REGEXP_LIKE(text,' '); SELECT emp_id, REGEXP_SUBSTR(text,' ') text FROM employee_comment WHERE REGEXP_LIKE(text,' '); SELECT emp_id, REGEXP_SUBSTR(text, '[ ]..-...[ ]') text FROM employee_comment WHERE REGEXP_LIKE(text, '[ ]..-...[ ]'); SELECT emp_id, REGEXP_SUBSTR(text, '[ ]..-...[ ]') text FROM employee_comment WHERE REGEXP_LIKE(text, '[ ]..-...[ ]');
30 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 RegEx matching Or can specify a range of chars: Or, finally, can state how many copies to match: SELECT emp_id, REGEXP_SUBSTR(text, '[0-9] ') text FROM employee_comment WHERE REGEXP_LIKE(text,' '); SELECT emp_id, REGEXP_SUBSTR(text, '[0-9] ') text FROM employee_comment WHERE REGEXP_LIKE(text,' '); SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}'); SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}');
31 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 RegExp matching Other operators: * - 0 or more matches or more matches ? - 0 or 1 match Also, can OR options together with | op Here: some phone nums have area codes, some not, so want to match both: SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}- [0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{3}- [0-9]{4}|[0-9]{3}-[0-9]{4}'); SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}- [0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{3}- [0-9]{4}|[0-9]{3}-[0-9]{4}');
32 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 RegExp matching Order of ORed together patterns matters: First matching pattern wins SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}- [0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0- 9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}'); SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}- [0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0- 9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}');
33 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 RegExp matching There’s a shared structure between the two, tho Area code is just optional Can use ? op SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}-)?[0-9]{3}-[0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}-)?[0- 9]{3}-[0-9]{4}'); SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}-)?[0-9]{3}-[0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}-)?[0- 9]{3}-[0-9]{4}');
34 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 RegExp matching Also, different kinds of separators: dash, dot, just blank Can OR together whole number patterns Better: Just use set of choices of each sep. SELECT emp_id, REGEXP_SUBSTR(text, '([0- 9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}[-. ])?[0- 9]{3}[-. ][0-9]{4}'); SELECT emp_id, REGEXP_SUBSTR(text, '([0- 9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}[-. ])?[0- 9]{3}[-. ][0-9]{4}');
35 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 RegExp matching One other thing: area codes in parentheses Of course, area codes are still optional Parentheses must be escaped - \( \) SELECT emp_id, REGEXP_SUBSTR(text, '([0- 9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0- 9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}[-. ]|\([0- 9]{3}\) )?[0-9]{3}[-. ][0-9]{4}'); SELECT emp_id, REGEXP_SUBSTR(text, '([0- 9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0- 9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}[-. ]|\([0- 9]{3}\) )?[0-9]{3}[-. ][0-9]{4}');
36 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 And now for something completely different: XML XML: eXtensible Mark-up Language Very popular language for semi-structured data Mark-up language: consists of elements composed of tags, like HTML Emerging lingua franca of the Internet, Web Services, inter-vender comm
37 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Unstructured data At one end of continuum: unstructured data Text files Stock market prices CIA intelligence intercepts Audio recordings “Just one damn bit after another” ~ Henry Ford No (intentional, formal) patterns to the data Difficult to manage/make sense of Why we need data-mining
38 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Structured data At the other end: structured data Tables in RDBMSs Data organized into semantic chunks entities Similar/related entities grouped together Relationships, classes Entities in same group have same structure Same fields/attributes/properties Easy to make sense of But sometimes too rigid a req. Difficult to send—convert to tab-delimited
39 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Semi-structured data Not too random Data organized into entities Similar/related grouped to form other entities Not too structured Some attributes may be missing Size of attributes may vary Support of lists/sets Juuust Right Data is self-describing
40 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Semi-structured data Predominant examples: HTML: HyperText Mark-up Language XML: eXtensible Mark-up Language NB: both mark-up languages (use tags) Mark-up lends self of semi-structured data Demarcate boundaries for entities But freely allow other entities inside
41 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Data model for semi-structured data Usually represented as directed graphs Graph: set of vertices (nodes) and edges Dots connected by lines; not nec. a tree! In model, Nodes ~ entities or fields/attributes Edges ~ attribute-of/sub-entity-of Example: publisher publishes >=0 books Each book has one title, one year, >=1 authors Draw publishers graph
42 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML is a SSD language Standard published by W3C Officially announced/recommended in 1998 XML != HTML XML != a replacement for HTML Both are mark-up languages Big diffs: 1. XML doesn’t use predefined tags (!) But it’s extensible: tags can be added 2. HTML is about presentation:,, XML is about content:,
43 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML syntax Like HTML in many respects but more strict All tags must be closed Can’t have: this is a line Every start tag has an end tag Although style can replace both IS case-sensitive IS space-sensitive XML doc has a unique root element
44 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML syntax Tags must be properly nested Not allowed I’m not kidding Intuition: file folders Elements may have quoted attributes … Comments same as in HTML: Draw publishers XML
45 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Escape chars in XML Some chars must be escaped Distinguish content from syntax Can also declare value to be pure text: >< <> && "" '' jsdljsd <>>]]> 3 < 5 "Don't call me 'Ishmael'!"
46 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML Namespaces Different schemas/DTDs may overlap XHTML and MathML share some tags Soln: namespaces as in Java/C++/C#
47 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Michael 123 Hilary 456 Bill 789 Michael 123 Hilary 456 Bill 789 row name ssn “Michael”123“Hilary”“Bill” persons XML: persons From Relational Data to XML Data NameSSNMailing-address Michael123NY Hilary456DC Bill789Chappaqua
48 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Semi-structured Data Explained List-valued attributes XML is not 1NF! Impossible in (single, BCNF) tables: two phones! namephone Bill ??? Hilary Bill Hilary Bill
49 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Object ids and References SSD graph might not be trees! But XML docs must be Would cause much redundancy Soln: same concept as pointers in C/C++/J Object ids and references Graph example: Movies: Lost in Translation, Hamlet Stars: Bill Murray, Scarlet Johansson Lost in Translation 2003 Hamlet 1999 Bill Murray Lost in Translation 2003 Hamlet 1999 Bill Murray
50 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 What do we do with XML? Things done with XML: Send to partners Parse XML received Convert to RDBMS rows Query for particular data Convert to other XML Convert to formats other than XML Lots of tools/standards for these…
51 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DTDs & understanding XML XML is extensible Advantage: when creating, we can use any tags we like Disadv: when reading, they can use any tags they like Using XML docs a priori is very difficult Solution: impose some constraints
52 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DTDs DTD: Document Type Definition You and partners/vertical industry/academic discipline decide on a DTD/schema for your docs Specify which entities you may use/must understand Specify legal relationships DTD specifies the grammar to be used DTD = set of rules for creating valid entities DTD tells your software what to look for in doc
53 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DTD examples Well-formed XML v. valid XML Simple example: Copy from: Partial publisher example rules: Root publisher Publisher name, book*, author* Book title, date, author+ Author firstname, middlename?, lastname
54 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Partial DTD example (typos!) <!DOCTYPE PUBLISHER [ <!DOCTYPE PUBLISHER [ DTD is not XML, but can be embedded in or ref.ed from XML Replacement for DTDs is XML Schema
55 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML Applications/dialects MathML: Mathematical Markup Language ations/ictp99/ictp99N8059.html ations/ictp99/ictp99N8059.html VoiceXML: es/rps.xml es/rps.xml ChemML: Chemical Markup Language XHMTL: HTML retrofitted as an XML application
56 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML Applications/dialects VoiceXML: AT&T Directory Assistance Image from
57 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 More XML Apps FIXML XML equiv. of FIX: Financial Information eXchange swiftML XML equiv. of SWIFT: Society for Worldwide Interbank Financial Telecommunications message format Apache’s Ant Scripting language for Java build management Many more:
58 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 More XML Applications/Protocols RSS: Rich Site Summary/Really Simple Syndication News sites, blogs… Screenshot More info: my channel story 1 … // other items my channel story 1 … // other items
59 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 More XML Applications/Protocols SOAP: Simple Object Access Protocol XML-based messaging format Used by Google API: Amazon API: Amazon light: Other examples: 10&topic=&topic_set= 10&topic=&topic_set SOAP envelope with header and body Request sales tax for total <SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1"> 100 <SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1"> 100
60 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 More XML Applications/Protocols %(key)s 0 10 true false %(key)s 0 10 true false
61 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 New topic: XML in Oracle - purchase-order e.g Alpha Tech 11257> AI EI-T Alpha Tech 11257> AI EI-T
62 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Storing XML data As of 9i, has XMLType data type By default, underlying storage is as CLOB CREATE TABLE purchase_order( po_id number(5) not null, customer_po_nbr varchar(20), customer_inception_date date, order_nbr number(5), purchase_order_doc xmltype, constraint purchase_order_pk primary key(po_id) ); CREATE TABLE purchase_order( po_id number(5) not null, customer_po_nbr varchar(20), customer_inception_date date, order_nbr number(5), purchase_order_doc xmltype, constraint purchase_order_pk primary key(po_id) );
63 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Loading XML into Oracle First, log in as sys: Now scott can import: connect sys/junk as sysdba create directory xml_data as '/xml'; grant read, write on directory xml_data to scott; connect sys/junk as sysdba create directory xml_data as '/xml'; grant read, write on directory xml_data to scott; connect scott/tiger declare bf1 bfile; begin bf1 := bfilename('XML_DATA', 'purch_ord.xml'); insert into purchase_order(po_id, purchase_order_doc) values(1000, xmltype(bf1, nls_charset_id('we8mswin1252'))); end; connect scott/tiger declare bf1 bfile; begin bf1 := bfilename('XML_DATA', 'purch_ord.xml'); insert into purchase_order(po_id, purchase_order_doc) values(1000, xmltype(bf1, nls_charset_id('we8mswin1252'))); end;
64 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Loading XML into Oracle Not just loading raw text XMLType data must be well-formed Parsable as XML Try modifying customer_name open tag
65 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Accessing XML in Oracle Now can look at raw XML: Can also use XPath to extract particular nodes and values, with extract function: SQL> SELECT purchase_order_doc FROM purchase_order; SQL> SELECT purchase_order_doc FROM purchase_order; SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order; SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order;
66 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XPath in Oracle Can also extract all nodes of one type, underneath some node, with double-slash // All purchase order items NB: this is not valid XML No unique root Can request just one with bracket op Numbering starts at 1, not 0 Wrong name/number no error, no results SQL> SELECT extract(purchase_order_doc, '/purchase_order/po_items/item[2]') FROM purchase_order; SQL> SELECT extract(purchase_order_doc, '/purchase_order/po_items/item[2]') FROM purchase_order; SQL> SELECT extract(purchase_order_doc, '/purchase_order//item') FROM purchase_order; SQL> SELECT extract(purchase_order_doc, '/purchase_order//item') FROM purchase_order;
67 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 extract v. extractvalue extractvalue returns value, not whole node: vs. extractvalue applies only to unique nodes: SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order; SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order; SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order; SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order; SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/po_items') FROM purchase_order; SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/po_items') FROM purchase_order;
68 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 existsnode function Can check whether node/location exists with existnode function Returns 1 or 0 Also applies to bracketed paths: SQL> SELECT po_id FROM purchase_order WHERE existsnode(purchase_order_doc, '/purchase_order/customer_name') = 1; SQL> SELECT po_id FROM purchase_order WHERE existsnode(purchase_order_doc, '/purchase_order/customer_name') = 1; SQL> SELECT po_id FROM purchase_order WHERE existsnode(purchase_order_doc, '/purchase_order/po_items/item[1]') = 1; SQL> SELECT po_id FROM purchase_order WHERE existsnode(purchase_order_doc, '/purchase_order/po_items/item[1]') = 1;
69 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Moving data from XML to relations To move single values from XML to tables, can simply use extractvalue in UPDATE statements: SQL> UPDATE purchase_order SET order_nbr = 7101, customer_po_nbr = extractvalue(purchase_order_doc, '/purchase_order/po_number'), customer_inception_date = to_date(extractvalue(purchase_order_doc, '/purchase_order/po_date'), 'yyyy-mm-dd'); SQL> UPDATE purchase_order SET order_nbr = 7101, customer_po_nbr = extractvalue(purchase_order_doc, '/purchase_order/po_number'), customer_inception_date = to_date(extractvalue(purchase_order_doc, '/purchase_order/po_date'), 'yyyy-mm-dd');
70 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Moving data from XML to relations What about moving set of nodes The two item nodes Use xmlsequence to get a varray of items Use TABLE to convert to a relation SQL> SELECT extract(purchase_order_doc, '/purchase_order//item') FROM purchase_order; SQL> SELECT extract(purchase_order_doc, '/purchase_order//item') FROM purchase_order; SQL> SELECT rownum, item.* FROM TABLE( SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item')) FROM purchase_order) item; SQL> SELECT rownum, item.* FROM TABLE( SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item')) FROM purchase_order) item;
71 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Moving data from XML to relations Result is a two-row relation with XMLTypes Can use extractvalue to extract this data First, create destination table: CREATE TABLE LINE_ITEM( ORDER_NBR NUMBER(9) NOT NULL, PART_NBR VARCHAR2(20) NOT NULL, QTY NUMBER(5) NOT NULL, FILLED_QTY NUMBER(5), CONSTRAINT line_item_pk PRIMARY KEY (ORDER_NBR,PART_NBR)); CREATE TABLE LINE_ITEM( ORDER_NBR NUMBER(9) NOT NULL, PART_NBR VARCHAR2(20) NOT NULL, QTY NUMBER(5) NOT NULL, FILLED_QTY NUMBER(5), CONSTRAINT line_item_pk PRIMARY KEY (ORDER_NBR,PART_NBR));
72 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Moving data from XML to relations Then insert results: SQL> INSERT INTO line_item(order_nbr,part_nbr,qty) SELECT 7109, extractvalue(column_value, '/item/part_number'), extractvalue(column_value, '/item/quantity') FROM TABLE( SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item')) FROM purchase_order ); SQL> INSERT INTO line_item(order_nbr,part_nbr,qty) SELECT 7109, extractvalue(column_value, '/item/part_number'), extractvalue(column_value, '/item/quantity') FROM TABLE( SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item')) FROM purchase_order );
73 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML Schemas and Oracle By default, XML must be well-formed to be read into the XMLType field XML is valid if it conforms to a schema To use a schema with Oracle, must first register it: declare bf1 bfile; begin bf1 := bfilename('XML_DATA', 'purch_ord.xsd'); dbms_xmlschema.registerschema( ' /schemas/purch_ord.xsd', bf1); end; declare bf1 bfile; begin bf1 := bfilename('XML_DATA', 'purch_ord.xsd'); dbms_xmlschema.registerschema( ' /schemas/purch_ord.xsd', bf1); end;
74 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML Schemas and Oracle With schema registered, can apply it to an XMLType field CREATE TABLE purchase_order2 (po_id NUMBER(5) NOT NULL, customer_po_nbr VARCHAR2(20), customer_inception_date DATE, order_nbr NUMBER(5), purchase_order_doc XMLTYPE, CONSTRAINT purchase_order2_pk PRIMARY KEY (po_id)) XMLTYPE COLUMN purchase_order_doc XMLSCHEMA " ch_ord.xsd" ELEMENT "purchase_order"; CREATE TABLE purchase_order2 (po_id NUMBER(5) NOT NULL, customer_po_nbr VARCHAR2(20), customer_inception_date DATE, order_nbr NUMBER(5), purchase_order_doc XMLTYPE, CONSTRAINT purchase_order2_pk PRIMARY KEY (po_id)) XMLTYPE COLUMN purchase_order_doc XMLSCHEMA " ch_ord.xsd" ELEMENT "purchase_order";
75 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Importing to schema field Try to import xml file, get error: declare bf1 bfile; begin bf1 := bfilename('XML_DATA', 'purch_ord.xml'); insert into purchase_order2(po_id, purchase_order_doc) values (2000, XMLTYPE(bf1, nls_charset_id('WE8MSWIN1252'))); end; declare bf1 bfile; begin bf1 := bfilename('XML_DATA', 'purch_ord.xml'); insert into purchase_order2(po_id, purchase_order_doc) values (2000, XMLTYPE(bf1, nls_charset_id('WE8MSWIN1252'))); end;
76 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Importing to schema field Root node of XML must specify the schema Change root to the following: Now can import Also fails if extra or missing nodes Modify company_name node Add new comments node <purchase_order xmlns:xsi=" xsi:noNamespaceSchemaLocation=" ome/xml/schemas/purch_ord.xsd"> <purchase_order xmlns:xsi=" xsi:noNamespaceSchemaLocation=" ome/xml/schemas/purch_ord.xsd">
77 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Can check to see whether schema is used Can call isSchemaBased(), getSchemaURL() and isSchemaValid() on XMLType fields: SQL> select po.purchase_order_doc.isSchemaBased(), po.purchase_order_doc.getSchemaURL(), po.purchase_order_doc.isSchemaValid() from purchase_order2 po; SQL> select po.purchase_order_doc.isSchemaBased(), po.purchase_order_doc.getSchemaURL(), po.purchase_order_doc.isSchemaValid() from purchase_order2 po;
78 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Updating XMLType data Can update XMLType data with ordinary UPDATE statements: Replaces whole XMLType object with new one SQL> UPDATE purchase_order po SET po.purchase_order_doc = XMLTYPE(BFILENAME('XML_DATA', 'purch_ord_alt.xml'), nls_charset_id('WE8MSWIN1252')) WHERE po.po_id = 2000; SQL> UPDATE purchase_order po SET po.purchase_order_doc = XMLTYPE(BFILENAME('XML_DATA', 'purch_ord_alt.xml'), nls_charset_id('WE8MSWIN1252')) WHERE po.po_id = 2000;
79 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Updating XMLType data Can also modify the existing XMLType object By writing node values updateXML() function does search/replace But searches for node, not value SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order po WHERE po_id = 1000; SQL> UPDATE purchase_order po SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/customer_name/text()', 'some other company') WHERE po.po_id = 1000; SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order po WHERE po_id = 1000; SQL> UPDATE purchase_order po SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/customer_name/text()', 'some other company') WHERE po.po_id = 1000;
80 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Updating XMLType data Can also write whole node, using XMLType: Validation/well-formedness is still checked SQL> UPDATE purchase_order po SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/customer_name', XMLTYPE(' some third company ')) WHERE po.po_id = 1000; SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order po WHERE po_id = 1000; SQL> UPDATE purchase_order po SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/customer_name', XMLTYPE(' some third company ')) WHERE po.po_id = 1000; SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name') FROM purchase_order po WHERE po_id = 1000;
81 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Updating XMLType data And can update items in a collection: SQL> SELECT extract(po.purchase_order_doc, '/purchase_order//item') FROM purchase_order po WHERE po.po_id = 1000; SQL> UPDATE purchase_order po SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/po_items/item[1]', XMLTYPE(' T ')) WHERE po.po_id = 1000; SQL> SELECT extract(po.purchase_order_doc, '/purchase_order//item') FROM purchase_order po WHERE po.po_id = 1000; SQL> UPDATE purchase_order po SET po.purchase_order_doc = updateXML(po.purchase_order_doc, '/purchase_order/po_items/item[1]', XMLTYPE(' T ')) WHERE po.po_id = 1000;
82 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Converting relational data to XML Saw how to put XML in a table Conversely, can convert ordinary relational data to XML XMLElement() generates an XML node First, create supplier table: CREATE TABLE SUPPLIER( SUPPLIER_ID NUMBER(5) NOT NULL, NAME VARCHAR2(30) NOT NULL, PRIMARY KEY (SUPPLIER_ID)); insert into supplier values(1, 'Acme'); insert into supplier values(2, 'Tilton'); insert into supplier values(3, 'Eastern'); CREATE TABLE SUPPLIER( SUPPLIER_ID NUMBER(5) NOT NULL, NAME VARCHAR2(30) NOT NULL, PRIMARY KEY (SUPPLIER_ID)); insert into supplier values(1, 'Acme'); insert into supplier values(2, 'Tilton'); insert into supplier values(3, 'Eastern');
83 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Converting relational data to XML Now can call XMLElement function to wrap values in tags: And can build it up: Don’t concatenate! Turns to strings, escapes Error in book SELECT XMLElement("supplier_id", s.supplier_id) || XMLElement("name", s.name) xml_fragment FROM supplier s; SELECT XMLElement("supplier_id", s.supplier_id) || XMLElement("name", s.name) xml_fragment FROM supplier s; SELECT XMLElement("supplier", XMLElement("supplier_id", s.supplier_id), XMLElement("name", s.name)) FROM supplier s; SELECT XMLElement("supplier", XMLElement("supplier_id", s.supplier_id), XMLElement("name", s.name)) FROM supplier s;
84 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XMLForest() More simply, can use XMLForest() function: SELECT XMLElement("supplier", XMLForest(s.supplier_id, s.name)) FROM supplier s; SELECT XMLElement("supplier", XMLForest(s.supplier_id, s.name)) FROM supplier s;
85 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XMLAgg() Can use XMLAgg() to put nodes together inside another node: SELECT XMLElement("supplier_list", XMLAgg(XMLElement("supplier", XMLElement("supplier_id", s.supplier_id), XMLElement("name", s.name) ))) xml_document FROM supplier s; SELECT XMLElement("supplier_list", XMLAgg(XMLElement("supplier", XMLElement("supplier_id", s.supplier_id), XMLElement("name", s.name) ))) xml_document FROM supplier s;
86 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 New topic: Data Warehousing Physical warehouse: stores different kinds of items combined from different sources in supply chain access items as a combined package “Synergy” DW is the sys containing the data from many DBs OLAP is the system for easily querying the DW Online analytical processing front-end to DW & stats
87 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Integrating Data Ad hoc combination of DBs from different sources can be problematic Data may be spread across many systems geographically by division different systems from before mergers…
88 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Conversion/scrubbing/merging Lots of issues… different types of data Varchar(255) v. char(30) Different values for data ‘GREEN’/’GR/’2 Semantic differences Cars v. Automobiles Missing values Handle with nulls or XML
89 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Federated DBs Situ: n different DBs must work together One idea: write programs for each to talk to each other one How many programs required? Like ambassadors for each country
90 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Federated DBs Better idea: introduce another DB write programs for it to talk to each other DB Now how many programs? English in business, French in diplomacy Warehousing Refreshed nightly
91 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 OLTP v. OLAP DWs usually not updated in real-time data is usually not live but care about higher-level, longer-term patterns For “knowledge workers”/decision-makers Live data is in system used by OLTP online transaction processing E.g., airline reservations OLTP data loaded into DW periodically, say nightly
92 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Utilizing Data Situ: each time manager has hunch requests custom reports direct programmers to write/modify SQL app to produce these results on higher or lower levels, for different specifics Problem: too difficult/expensive/slow too great a time lag
93 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 EISs Could just write queries at command-prompt But decision makes aren’t (all) SQL programmers Soln: create an executive information system provides friendly front-end to common, important queries basically a simple DB front-end your project part 5 GROUP BY queries are particularly applicable…
94 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 EISs v. OLAP Okay for fixed set of queries But what if queries are open-ended? Q: What’s driving sales in the Northeast? What’s the source cause? Result from one query influences next query tried OLAP systems are interactive: run query analyze results think of new query repeat
95 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Star Schemas Popular schema for DW data One central DB surrounded by specific DBs Center: fact table Extremities: data tables Fields in fact table are foreign keys to data tables Normalization Snowflake Schema May not be worthwhile…
96 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Dates and star schemas OLAP behaves as though you had a Days table, with every possible row Dates(day, week, month, year, DID) (5, 27, 7, 2000) Can join on Days like any other table
97 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Dates and star schemas E.g.: products x salesperson x region x date Products sold by salespeople in regions on dates Regular dim tables: Product(PID, name, color) Emp(name, SSN, sal) Region(name, RID) Fact table: Sales(PID, DID, SSN, RID) Interpret as a cube (cross product of all dimensions) Can have both data and stats
98 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Drill-down & roll-up Imagine: notice some region’s sales way up Why? Good salesperson? Some popular product there? Maybe need to search by month, or month and product, abstract back up to just product… “slicing & dicing”
99 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 OLAP and data warehousing Could write GROUP BY queries for each OLAP systems provide simpler, non-SQL interface for this sort of thing Vendors: MicroStrategy, SAP, etc. Otoh: DW-style operators have been added to SQL and some DBMSs…
100 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: ROLLUP (Oracle) Suppose have orders table (from two years), with region and date info: Can select total sales: Examples derived/from Mastering Oracle SQL, 2e (O’Reilly) Get data here: SELECT sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id; SELECT sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id; SQL> column month format a10 SQL> describe all_orders SQL> column month format a10 SQL> describe all_orders
101 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Can write GROUP BY queries for year or region or both: SELECT r.name region, o.year, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY (r.name, o.year); SELECT r.name region, o.year, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY (r.name, o.year); DW extensions in SQL: ROLLUP (Oracle)
102 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 ROLLUP operator Extension of GROUP BY Does GROUP BY on several levels, simultaneously Order matters Get sales totals for each region/year pair each region, and the grand total: SELECT r.name region, o.year, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY ROLLUP (r.name, o.year); SELECT r.name region, o.year, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY ROLLUP (r.name, o.year); DW extensions in SQL: ROLLUP (Oracle)
103 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Change the order of the group fields to get a different sequence of groups To get totals for each year/region pair, each year, and the grand total, and just reverse group-by order: SELECT o.year, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY ROLLUP (o.year, r.name); SELECT o.year, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY ROLLUP (o.year, r.name); DW extensions in SQL: ROLLUP (Oracle)
104 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Adding more dimensions, like month, is easy (apart from formatting): NB: summing happens on each level SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY ROLLUP (o.year, o.month, r.name); SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY ROLLUP (o.year, o.month, r.name); DW extensions in SQL: ROLLUP (Oracle)
105 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 If desired, can combine fields for the sake of grouping: DW extensions in SQL: ROLLUP (Oracle) SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY ROLLUP ((o.year, o.month), r.name); SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY ROLLUP ((o.year, o.month), r.name);
106 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: CUBE (Oracle) Another GROUP BY extension: CUBE Subtotals all possible combins of group-by fields (powerset) Syntax: “ROLLUP” “CUBE” Order of fields doesn’t matter (apart from ordering) To get subtotals for each region/month pair, each region, each month, and the grand total: SELECT to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY CUBE (o.month, r.name); SELECT to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY CUBE (o.month, r.name);
107 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: CUBE (Oracle) Again, can easily add more dimensions: SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY CUBE (o.year, o.month, r.name); SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY CUBE (o.year, o.month, r.name);
108 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW SQL exts: GROUPING SETS (Oracle) That’s a lot of rows Instead of a cube of all combinations, maybe we just want the totals for each individual field: SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY GROUPING SETS (o.year, o.month, r.name); SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month, r.name region, sum(o.tot_sales) FROM all_orders o join region r ON r.region_id = o.region_id GROUP BY GROUPING SETS (o.year, o.month, r.name);
109 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Next Final evals More lab…
110 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 That’s all, folks! Selected solutions to exercises: sqlzoo ~ “Answers” on sqlzoo.net PL/SQL ~ archive/fall04/plsql/ archive/fall04/plsql/ mpjohnson-at-gmail.com