1Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 OCL3 Oracle 10g: SQL & PL/SQL Session #10 Matthew P. Johnson CISDD, CUNY January, 2005.



2 Agenda
- Security & web apps
- RegEx support in 10g
- Oracle & XML
- Data warehousing
- More on the PL/SQL labs
- Any more lab?

3 Review: Why security is hard
- It’s a “negative deliverable”
- It’s an asymmetric threat
- Tolstoy: “Happy families are all alike; every unhappy family is unhappy in its own way.”
  - Analogs: “homeland”, jails, debugging, proofreading, Popperian science, fishing, MC algs
- So: fix biggest problems first

4 DB users have privileges
- SELECT: read access to all columns
- INSERT(col-name): can insert rows with non-default values in this column
- INSERT: can insert rows with non-default values in all columns
- DELETE
- REFERENCES(col-name): can define foreign keys that refer to (or other constraints that mention) this column
- TRIGGER: triggers can reference table
- EXECUTE: can run function/SP

5 Granting privileges (Oracle)
- One method of setting access levels
- Creator of object automatically gets all privileges to it
  - Possible objects: tables, whole databases, stored functions/procedures, etc.
  - .* - all tables in DB
- A privileged user can grant privileges to other users or groups

    GRANT privileges ON object TO user

    GRANT SELECT ON mytable TO someone WITH GRANT OPTION;

6 Granting and revoking
- Privileged user has privileges
- Privileged-WGO (WITH GRANT OPTION) user can grant them, with or without the grant option (GO)
- Granter can revoke privileges or GO
- Revocation cascades by default
  - To prevent, use RESTRICT (at end of cmd)
  - If would cascade, command fails
- Can change owner:

    ALTER TABLE my-tbl OWNER TO new-owner;

7 Granting and revoking
- What we giveth, we may taketh away
- mjohnson: (effects?)

    GRANT SELECT, INSERT ON my-table TO george WITH GRANT OPTION;

- george: (effects?)

    GRANT SELECT ON my-table TO laura;

- mjohnson: (effects?)

    REVOKE SELECT ON my-table FROM laura;

8 Role-based authorization
- In SQL-1999, privileges assigned with roles
- For example:
  - Student role
  - Instructor role
  - Admin role
- Each role gets to do same (sorts of) things
- Privileges assigned by assigning role to users

    GRANT SELECT ON my-table TO employee;
    GRANT employee TO billg;

9 Passwords
- DBMS recognizes your privileges because it recognizes you
  - how?
- Storing passwords in the DB is a bad idea

10 Hashed or digested passwords
- One-way hash function:
  1. Computing f(x) is easy;
  2. Computing f^-1(y) is hard/impossible;
  3. Finding some x2 s.t. f(x2) = f(x) (“collisions”) is hard/impossible
- Intuitively: seeing f(x) gives little (useful) info on x
  - x “looks random”
  - PRNGs
- MD5, SHA-1
- RFID for cars:
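The idea of storing f(password) rather than the password itself can be sketched in Python. This is only an illustration under stated assumptions: it swaps the MD5/SHA-1 functions named on the slide for salted PBKDF2-HMAC-SHA256 from the standard library, and the function names `hash_password` and `verify_password` are hypothetical, not part of the course material.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only these two values would be stored in the DB."""
    if salt is None:
        salt = os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute f(x) for the candidate password and compare digests."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored)


salt, stored = hash_password("secret")
print(verify_password("secret", salt, stored))  # correct password
print(verify_password("guess", salt, stored))   # wrong password
```

The stored digest never needs to be inverted: login just recomputes the hash and compares.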

11 Built-in accounts
- Many DBMSs (and OSs) have built-in demo accounts by default
  - In some versions, must “opt out”
- MySQL: root/(blank) (closed on sales)
- Oracle: scott/tiger (was open on sales last year)
- SQLServer: sa/(blank/null)

12 Query-related: Injection attacks
- Here’s a situation:
  - Prompt for user/pass
  - Do lookup:

    SELECT * FROM users WHERE user=u AND password=p;

  - If found, user gets in
- test.user table in MySQL
- Apart from no hashing, is this safe?

13 Injection attacks
- We expect to get input of something like:
  - user: mjohnson
  - pass: secret
- Query template:

    SELECT * FROM users WHERE user = u AND password = p;

- After substitution:

    SELECT * FROM users WHERE user = 'mjohnson' AND password = 'secret';

14 Injection attacks – MySQL/Perl/PHP
- Consider another input:
  - user: ' OR 1=1 OR user = '
  - pass: ' OR 1=1 OR password = '
- Query template:

    SELECT * FROM users WHERE user = u AND password = p;

- After substitution:

    SELECT * FROM users WHERE user = '' OR 1=1 OR user = '' AND password = '' OR 1=1 OR password = '';
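The substitution above can be reproduced end to end. The sketch below is a hypothetical demo, not the course's Perl/PHP code: it assumes an in-memory SQLite database in place of MySQL, a one-row `users` table, and a deliberately vulnerable helper named `unsafe_login`. The second injected string names the `password` column so that the final query is valid against this table.

```python
import sqlite3

# Assumption: SQLite stands in for the MySQL test database from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('mjohnson', 'secret')")


def unsafe_login(user, password):
    # Vulnerable on purpose: the input is pasted straight into the SQL text.
    query = ("SELECT * FROM users WHERE user = '" + user +
             "' AND password = '" + password + "'")
    return conn.execute(query).fetchall()


honest = unsafe_login("mjohnson", "wrong")
attack = unsafe_login("' OR 1=1 OR user = '", "' OR 1=1 OR password = '")
print(len(honest), len(attack))
```

A wrong password finds no row, while the injected input makes the WHERE clause true for every row, so the attacker "logs in" without knowing any password.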

15 Injection attacks – MySQL/Perl/PHP
- Consider this one:
  - user: your-boss' OR 1=1 #
  - pass: abc
- Query template:

    SELECT * FROM users WHERE user = u AND password = p;

- After substitution (everything after # is a comment):

    SELECT * FROM users WHERE user = 'your-boss' OR 1=1 #' AND password = 'abc';

16 Injection attacks – MySQL/Perl/PHP
- Consider another input:
  - user: your-boss
  - pass: ' OR 1=1 OR password = '
- Query template:

    SELECT * FROM users WHERE user = u AND password = p;

- After substitution:

    SELECT * FROM users WHERE user = 'your-boss' AND password = '' OR 1=1 OR password = '';

17 Multi-command inj. attacks (other DBs)
- Consider another input:
  - user: '; DELETE FROM users WHERE user = 'abc'; SELECT * FROM users WHERE password = '
  - pass: abc
- Query template:

    SELECT * FROM users WHERE user = u AND password = p;

- After substitution:

    SELECT * FROM users WHERE user = ''; DELETE FROM users WHERE user = 'abc'; SELECT * FROM users WHERE password = '' AND password = 'abc';

18 Multi-command inj. attacks (other DBs)
- Consider another input:
  - user: '; DROP TABLE users; SELECT * FROM users WHERE password = '
  - pass: abc
- Query template:

    SELECT * FROM users WHERE user = u AND password = p;

- After substitution:

    SELECT * FROM users WHERE user = ''; DROP TABLE users; SELECT * FROM users WHERE password = '' AND password = 'abc';

19 Multi-command inj. attacks (other DBs)
- Consider another input:
  - user: '; SHUTDOWN WITH NOWAIT; SELECT * FROM users WHERE password = '
  - pass: abc
- Query template:

    SELECT * FROM users WHERE user = u AND password = p;

- After substitution:

    SELECT * FROM users WHERE user = ''; SHUTDOWN WITH NOWAIT; SELECT * FROM users WHERE password = '' AND password = 'abc';

20 Injection attacks – MySQL/Perl/PHP
- Consider another input:
  - user: your-boss
  - pass: ' OR 1=1 AND user = 'your-boss
- Delete your boss!
- Query template:

    DELETE FROM users WHERE user = u AND pass = p;

- After substitution:

    DELETE FROM users WHERE user = 'your-boss' AND pass = '' OR 1=1 AND user = 'your-boss';

21 Injection attacks – MySQL/Perl/PHP
- Consider another input:
  - user: ' OR 1=1 OR user = '
  - pass: ' OR 1=1 OR user = '
- Delete everyone!
- Query template:

    DELETE FROM users WHERE user = u AND pass = p;

- After substitution:

    DELETE FROM users WHERE user = '' OR 1=1 OR user = '' AND pass = '' OR 1=1 OR user = '';

22 Preventing injection attacks
- Ultimate source of problem: quotes
- Soln 1: don’t allow quotes!
  - Reject any entered data containing single quotes
- Q: Is this satisfactory?
  - Does Amazon need to sell O’Reilly books?
- Soln 2: escape any single quotes
  - Replace any ' with a '' or \'
  - In Perl, use taint mode (won’t show)
  - In PHP, turn on magic_quotes_gpc flag in .htaccess (show both PHP versions)

23 Preventing injection attacks
- Soln 3: use prepared, parameter-based queries
  - Supported in JDBC, Perl DBI, PHP ext/mysqli
- Very dangerous: using tainted data to run commands at the Unix command prompt
  - Semi-colons, prime char, etc.
  - Safest: define set of legal chars, not illegal ones
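A parameter-based query keeps user input out of the SQL text entirely. The sketch below is an assumption-laden illustration in Python: it uses the standard `sqlite3` module rather than the JDBC, Perl DBI, or PHP ext/mysqli interfaces named on the slide, and the table contents and the `safe_login` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("mjohnson", "secret"))
conn.execute("INSERT INTO users VALUES (?, ?)", ("o'reilly", "books"))


def safe_login(user, password):
    # The ? placeholders are bound as data, never parsed as SQL.
    query = "SELECT user FROM users WHERE user = ? AND password = ?"
    return conn.execute(query, (user, password)).fetchall()


print(safe_login("mjohnson", "secret"))                                   # normal login
print(safe_login("' OR 1=1 OR user = '", "' OR 1=1 OR password = '"))     # injection attempt
print(safe_login("o'reilly", "books"))                                    # legitimate quote
```

The injection attempt is compared literally against the stored values and matches nothing, while a name that legitimately contains a single quote still works without any manual escaping.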

24 Preventing injection attacks
- When to do security checking for quotes, etc.?
- Natural choice: in client-side data validation
- But not enough!
  - As we saw earlier: can submit GET and POST params manually
  - Must do security checking on server
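Server-side checking against a set of legal characters might look like the following minimal sketch. The allowed character set, the 32-character limit, and the name `is_legal_username` are assumptions chosen for illustration; a real application would pick its own rules and would still use parameter-based queries.

```python
import re

# Assumption: usernames may contain only letters, digits, underscore, dot, hyphen.
LEGAL_USERNAME = re.compile(r"[A-Za-z0-9_.-]{1,32}")


def is_legal_username(value: str) -> bool:
    """Whitelist check: accept only strings made entirely of legal characters."""
    return LEGAL_USERNAME.fullmatch(value) is not None


for candidate in ["mjohnson", "your-boss", "' OR 1=1 #", ""]:
    print(repr(candidate), is_legal_username(candidate))
```

Because the check runs on the server, it still applies when someone bypasses the browser and submits GET or POST parameters by hand.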

25 More Info
- phpGB MySQL Injection Vulnerability
- “How I hacked PacketStorm”

26 SQL*Plus settings

    SQL> SET RECSEP OFF
    SQL> COLUMN text FORMAT A60

27 New topic: Regular Expressions
- In automata theory, Finite Automata are the simplest, weakest kind of computer; Turing Machines the strongest
  - Chomsky’s Hierarchy
- FAs are equivalent to regular expressions
  - Expressions that specify a pattern
  - Can check whether a string matches the pattern

28 RegEx matching
- Use REGEXP_LIKE
- Metachar for any char is . (dot)
- First, get employee_comment table
- Now do search:

    SELECT emp_id, text FROM employee_comment WHERE REGEXP_LIKE(text,' ');

- So far, like LIKE

29 RegEx matching
- Can also pull out the matching text with REGEXP_SUBSTR:

    SELECT emp_id, REGEXP_SUBSTR(text,' ') text FROM employee_comment WHERE REGEXP_LIKE(text,' ');

- If want only numbers, can specify a set of chars rather than a dot:

    SELECT emp_id, REGEXP_SUBSTR(text, '[ ]..-...[ ]') text FROM employee_comment WHERE REGEXP_LIKE(text, '[ ]..-...[ ]');

30 RegEx matching
- Or can specify a range of chars:

    SELECT emp_id, REGEXP_SUBSTR(text, '[0-9] ') text FROM employee_comment WHERE REGEXP_LIKE(text,' ');

- Or, finally, can state how many copies to match:

    SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}');
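The same filter-then-extract idea can be tried outside the database. This is a rough analog in Python's `re` module, which is not Oracle's regex engine: `search(...) is not None` plays the role of REGEXP_LIKE and `group(0)` the role of REGEXP_SUBSTR. The sample comments, the employee ids, and the helper `first_phone` are made up for illustration.

```python
import re

# Hypothetical stand-in for the employee_comment table: emp_id -> text.
comments = {
    7369: "Call me at 555-1234 after lunch.",
    7499: "No phone number on file.",
}

PHONE = re.compile(r"[0-9]{3}-[0-9]{4}")  # three digits, a dash, four digits


def first_phone(text):
    """Return the first matching substring, or None if nothing matches."""
    m = PHONE.search(text)
    return m.group(0) if m else None


# Keep only rows that match (the WHERE clause), then extract the match.
matches = {emp_id: first_phone(text)
           for emp_id, text in comments.items()
           if PHONE.search(text)}
print(matches)
```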

31 RegExp matching
- Other operators:
  - * - 0 or more matches
  - + - 1 or more matches
  - ? - 0 or 1 match
- Also, can OR options together with | op
- Here: some phone nums have area codes, some not, so want to match both:

    SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{4}');

32 RegExp matching
- Order of ORed together patterns matters:
  - First matching pattern wins

    SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}');

33 RegExp matching
- There’s a shared structure between the two, tho
  - Area code is just optional
  - Can use ? op

    SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}-)?[0-9]{3}-[0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}-)?[0-9]{3}-[0-9]{4}');

34 RegExp matching
- Also, different kinds of separators:
  - dash, dot, just blank
- Can OR together whole number patterns
- Better: Just use set of choices of each sep.

    SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}');

35 RegExp matching
- One other thing: area codes in parentheses
  - Of course, area codes are still optional
  - Parentheses must be escaped - \( \)

    SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}') text FROM employee_comment WHERE REGEXP_LIKE(text,'([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}');
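The final pattern can be checked against each phone format it is meant to cover. The sketch below runs the same pattern text through Python's `re` module as a stand-in for Oracle's engine; the sample strings and the helper name `extract_phone` are assumptions for illustration.

```python
import re

# Optional area code (digits + separator, or parenthesized + space),
# then 3 digits, a separator, and 4 digits.
PHONE = re.compile(r"([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}")

samples = [
    "555-1234",
    "212-555-1234",
    "212.555.1234",
    "(212) 555-1234",
    "no number here",
]


def extract_phone(text):
    """Return the whole matched phone number, or None."""
    m = PHONE.search(text)
    return m.group(0) if m else None


results = [extract_phone(s) for s in samples]
for sample, result in zip(samples, results):
    print(repr(sample), "->", repr(result))
```

Note that `group(0)` is used rather than `findall`, because the pattern contains a capturing group and `findall` would return only the group's contents.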

36 And now for something completely different: XML
- XML: eXtensible Mark-up Language
- Very popular language for semi-structured data
- Mark-up language: consists of elements composed of tags, like HTML
- Emerging lingua franca of the Internet, Web Services, inter-vendor comm

37 Unstructured data
- At one end of continuum: unstructured data
  - Text files
  - Stock market prices
  - CIA intelligence intercepts
  - Audio recordings
  - “Just one damn bit after another” ~ Henry Ford
- No (intentional, formal) patterns to the data
- Difficult to manage/make sense of
  - Why we need data-mining

38 Structured data
- At the other end: structured data
  - Tables in RDBMSs
  - Data organized into semantic chunks (entities)
  - Similar/related entities grouped together
    - Relationships, classes
  - Entities in same group have same structure
    - Same fields/attributes/properties
- Easy to make sense of
  - But sometimes too rigid a req.
  - Difficult to send: convert to tab-delimited

39 Semi-structured data
- Not too random
  - Data organized into entities
  - Similar/related grouped to form other entities
- Not too structured
  - Some attributes may be missing
  - Size of attributes may vary
    - Support of lists/sets
- Juuust Right
  - Data is self-describing

40 Semi-structured data
- Predominant examples:
  - HTML: HyperText Mark-up Language
  - XML: eXtensible Mark-up Language
- NB: both mark-up languages (use tags)
- Mark-up lends itself to semi-structured data
  - Demarcate boundaries for entities
  - But freely allow other entities inside

41 Data model for semi-structured data
- Usually represented as directed graphs
- Graph: set of vertices (nodes) and edges
  - Dots connected by lines; not nec. a tree!
- In model,
  - Nodes ~ entities or fields/attributes
  - Edges ~ attribute-of/sub-entity-of
- Example: publisher publishes >=0 books
  - Each book has one title, one year, >=1 authors
  - Draw publishers graph

42 XML is a SSD language
- Standard published by W3C
  - Officially announced/recommended in 1998
- XML != HTML
  - XML != a replacement for HTML
  - Both are mark-up languages
- Big diffs:
  1. XML doesn’t use predefined tags (!)
     - But it’s extensible: tags can be added
  2. HTML is about presentation; XML is about content

43 XML syntax
- Like HTML in many respects but more strict
- All tags must be closed
  - Can’t have an unclosed tag, e.g. <tag>this is a line
  - Every start tag has an end tag
  - Although the empty-element <tag/> style can replace both
- IS case-sensitive
- IS space-sensitive
- XML doc has a unique root element

44 XML syntax
- Tags must be properly nested
  - Not allowed: overlapping tags around “I’m not kidding”
  - Intuition: file folders
- Elements may have quoted attributes
  - …
- Comments same as in HTML: <!-- comment -->
- Draw publishers XML
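These well-formedness rules are exactly what an XML parser enforces. The following sketch uses Python's standard `xml.etree.ElementTree` parser to test a few tiny documents; the element names and the helper `is_well_formed` are hypothetical examples, not taken from the slides.

```python
import xml.etree.ElementTree as ET

documents = {
    "well formed": "<book year='2003'><title>XML Basics</title></book>",
    "badly nested": "<book><title>XML Basics</book></title>",
    "unclosed tag": "<book><title>XML Basics</title>",
    "case mismatch": "<Book></book>",
    "two roots": "<book/><book/>",
}


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


checks = {label: is_well_formed(doc) for label, doc in documents.items()}
for label, ok in checks.items():
    print(label, ok)
```

Only the first document passes; each of the others breaks one of the rules listed above (nesting, closing, case-sensitivity, unique root).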

45 Escape chars in XML
- Some chars must be escaped
  - Distinguish content from syntax

    >   &gt;
    <   &lt;
    &   &amp;
    "   &quot;
    '   &apos;

- Can also declare value to be pure text:

    <![CDATA[jsdljsd <>>]]>

- Examples:

    3 &lt; 5
    &quot;Don&apos;t call me &apos;Ishmael&apos;!&quot;
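Escaping is normally left to a library rather than done by hand. As a small sketch, Python's standard `xml.sax.saxutils.escape` replaces &, < and > by default; the extra `QUOTES` mapping passed here is an assumption added so that both quote characters are escaped as well.

```python
from xml.sax.saxutils import escape, unescape

# Extra replacements beyond the default &, <, >.
QUOTES = {'"': "&quot;", "'": "&apos;"}

raw = "\"Don't call me 'Ishmael'!\""
escaped = escape(raw, QUOTES)
comparison = escape("3 < 5")
company = escape("AT&T")

# unescape takes the reverse mapping: entity -> character.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})

print(escaped)
print(comparison)
print(company)
```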

46 XML Namespaces
- Different schemas/DTDs may overlap
  - XHTML and MathML share some tags
- Soln: namespaces
  - as in Java/C++/C#

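47 From Relational Data to XML Data

Relational table:

    Name     SSN   Mailing-address
    Michael  123   NY
    Hilary   456   DC
    Bill     789   Chappaqua

XML: persons

    <persons>
      <row> <name>Michael</name> <ssn>123</ssn> </row>
      <row> <name>Hilary</name> <ssn>456</ssn> </row>
      <row> <name>Bill</name> <ssn>789</ssn> </row>
    </persons>

As a tree: persons is the root; each row child has a name leaf and an ssn leaf (“Michael”, 123, “Hilary”, “Bill”, ...)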
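The row-to-element mapping can be written as a few lines of code. This sketch assumes the element names persons, row, name and ssn shown in the slide's tree, uses Python's standard `xml.etree.ElementTree`, and introduces a hypothetical helper `rows_to_xml`; the mailing-address column is left out, as in the slide's XML.

```python
import xml.etree.ElementTree as ET

# (name, ssn) pairs taken from the relational table on the slide.
rows = [("Michael", "123"), ("Hilary", "456"), ("Bill", "789")]


def rows_to_xml(rows):
    """Build a <persons> element with one <row> child per tuple."""
    persons = ET.Element("persons")
    for name, ssn in rows:
        row = ET.SubElement(persons, "row")
        ET.SubElement(row, "name").text = name
        ET.SubElement(row, "ssn").text = ssn
    return persons


xml_text = ET.tostring(rows_to_xml(rows), encoding="unicode")
print(xml_text)
```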

48 Semi-structured Data Explained
- List-valued attributes
  - XML is not 1NF!
- Impossible in (single, BCNF) tables:
  - two phones!

    name     phone
    Bill     ???
    Hilary

49 Object ids and References
- SSD graphs might not be trees!
- But XML docs must be
  - Would cause much redundancy
- Soln: same concept as pointers in C/C++/J
  - Object ids and references
- Graph example:
  - Movies: Lost in Translation, Hamlet
  - Stars: Bill Murray, Scarlett Johansson
- Example elements: Lost in Translation, 2003; Hamlet, 1999; Bill Murray

50 What do we do with XML?
- Things done with XML:
  - Send to partners
  - Parse XML received
  - Convert to RDBMS rows
  - Query for particular data
  - Convert to other XML
  - Convert to formats other than XML
- Lots of tools/standards for these…

51 DTDs & understanding XML
- XML is extensible
- Advantage: when creating, we can use any tags we like
- Disadv: when reading, they can use any tags they like
  - Using XML docs a priori is very difficult
- Solution: impose some constraints

52 DTDs
- DTD: Document Type Definition
- You and partners/vertical industry/academic discipline decide on a DTD/schema for your docs
  - Specify which entities you may use/must understand
  - Specify legal relationships
- DTD specifies the grammar to be used
  - DTD = set of rules for creating valid entities
- DTD tells your software what to look for in doc

53 DTD examples
- Well-formed XML v. valid XML
- Partial publisher example rules:
  - Root -> publisher
  - Publisher -> name, book*, author*
  - Book -> title, date, author+
  - Author -> firstname, middlename?, lastname

54 Partial DTD example (typos!)

    <!DOCTYPE PUBLISHER [

- DTD is not XML, but can be embedded in or ref.ed from XML
- Replacement for DTDs is XML Schema

55 XML Applications/dialects
- MathML: Mathematical Markup Language
  - ations/ictp99/ictp99N8059.html
- VoiceXML:
  - es/rps.xml
- ChemML: Chemical Markup Language
- XHTML: HTML retrofitted as an XML application

56 XML Applications/dialects
- VoiceXML:
  - AT&T Directory Assistance

57 More XML Apps
- FIXML
  - XML equiv. of FIX: Financial Information eXchange
- swiftML
  - XML equiv. of SWIFT: Society for Worldwide Interbank Financial Telecommunications message format
- Apache’s Ant
  - Scripting language for Java build management
- Many more

58 More XML Applications/Protocols
- RSS: Rich Site Summary/Really Simple Syndication
  - News sites, blogs…
  - Screenshot
  - More info
- Example feed: channel “my channel” with item “story 1”, … // other items

59 More XML Applications/Protocols
- SOAP: Simple Object Access Protocol
  - XML-based messaging format
  - Used by Google API, Amazon API, Amazon light
  - Other examples: 10&topic=&topic_set=
- SOAP envelope with header and body
  - Request sales tax for total

    <SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1"> 100

60 More XML Applications/Protocols

    %(key)s 0 10 true false

61 New topic: XML in Oracle - purchase-order e.g.

    Alpha Tech 11257> AI EI-T

62 Storing XML data
- As of 9i, has XMLType data type
  - By default, underlying storage is as CLOB

    CREATE TABLE purchase_order(
      po_id number(5) not null,
      customer_po_nbr varchar(20),
      customer_inception_date date,
      order_nbr number(5),
      purchase_order_doc xmltype,
      constraint purchase_order_pk primary key(po_id)
    );

63 Loading XML into Oracle
- First, log in as sys:

    connect sys/junk as sysdba
    create directory xml_data as '/xml';
    grant read, write on directory xml_data to scott;

- Now scott can import:

    connect scott/tiger
    declare
      bf1 bfile;
    begin
      bf1 := bfilename('XML_DATA', 'purch_ord.xml');
      insert into purchase_order(po_id, purchase_order_doc)
        values(1000, xmltype(bf1, nls_charset_id('we8mswin1252')));
    end;
    /

64 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Loading XML into Oracle
Not just loading raw text
- XMLType data must be well-formed
- Parsable as XML
Try modifying customer_name open tag
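The well-formedness check can also be tried without a file, by building an XMLType from a string literal. This is a minimal sketch: the one-element documents below are made up for illustration, and selecting from dual avoids changing the purchase_order table.

```sql
-- Well-formed: the open and close tags match, so an XMLType is built.
SELECT XMLType('<purchase_order><customer_name>Alpha Tech</customer_name></purchase_order>') AS doc
FROM dual;

-- Not well-formed: the open tag is misspelled (customer_nam), so it no
-- longer matches the close tag. Oracle should reject this with an XML
-- parsing error rather than return the text.
SELECT XMLType('<purchase_order><customer_nam>Alpha Tech</customer_name></purchase_order>') AS doc
FROM dual;
```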

65 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Accessing XML in Oracle
Now can look at raw XML:
SQL> SELECT purchase_order_doc FROM purchase_order;
Can also use XPath to extract particular nodes and values, with extract function:
SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order;

66 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XPath in Oracle
Can also extract all nodes of one type, underneath some node, with double-slash //
- All purchase order items
SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')
     FROM purchase_order;
NB: this is not valid XML
- No unique root
Can request just one with bracket op
- Numbering starts at 1, not 0
- Wrong name/number → no error, no results
SQL> SELECT extract(purchase_order_doc, '/purchase_order/po_items/item[2]')
     FROM purchase_order;
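The indexing rules can be checked on a small inline document, without touching the purchase_order table. This is a sketch only: the two-item po_items document is hypothetical, not the lecture's purch_ord.xml.

```sql
-- item[1] is the FIRST item, because XPath positions are 1-based.
SELECT extract(
         XMLType('<po_items><item>A</item><item>B</item></po_items>'),
         '/po_items/item[1]') AS first_item
FROM dual;

-- There is no third item: the result is NULL, not an error.
SELECT extract(
         XMLType('<po_items><item>A</item><item>B</item></po_items>'),
         '/po_items/item[3]') AS no_such_item
FROM dual;

-- // matches item nodes at any depth. Both items come back together,
-- as a fragment with no single root node.
SELECT extract(
         XMLType('<po_items><item>A</item><item>B</item></po_items>'),
         '//item') AS all_items
FROM dual;
```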

67 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 extract v. extractvalue
extractvalue returns value, not whole node:
SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order;
vs.
SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order;
extractvalue applies only to unique nodes:
SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/po_items')
     FROM purchase_order;
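The difference between the two functions is easiest to see side by side on one node. A minimal sketch, assuming a made-up one-element document (the po root and the doc column alias are illustrative names):

```sql
SELECT extractvalue(doc, '/po/customer_name')   AS name_value,  -- VARCHAR2 holding just the text
       extract(doc, '/po/customer_name')        AS name_node,   -- XMLType holding the whole customer_name element
       extract(doc, '/po/customer_name/text()') AS name_text    -- XMLType holding only the text node
FROM (SELECT XMLType('<po><customer_name>Alpha Tech</customer_name></po>') AS doc
      FROM dual);
```

extractvalue is the convenient choice when the value feeds an ordinary VARCHAR2, NUMBER or DATE column. extract is the choice when the result will be processed further as XML.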

68 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 existsnode function
Can check whether node/location exists with existsnode function
- Returns 1 or 0
SQL> SELECT po_id FROM purchase_order
     WHERE existsnode(purchase_order_doc, '/purchase_order/customer_name') = 1;
Also applies to bracketed paths:
SQL> SELECT po_id FROM purchase_order
     WHERE existsnode(purchase_order_doc, '/purchase_order/po_items/item[1]') = 1;

69 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Moving data from XML to relations
To move single values from XML to tables, can simply use extractvalue in UPDATE statements:
SQL> UPDATE purchase_order
     SET order_nbr = 7101,
         customer_po_nbr = extractvalue(purchase_order_doc, '/purchase_order/po_number'),
         customer_inception_date = to_date(extractvalue(purchase_order_doc, '/purchase_order/po_date'), 'yyyy-mm-dd');

70 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Moving data from XML to relations
What about moving a set of nodes?
- The two item nodes
SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')
     FROM purchase_order;
Use xmlsequence to get a varray of items
- Use TABLE to convert to a relation
SQL> SELECT rownum, item.*
     FROM TABLE(
       SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))
       FROM purchase_order) item;

71 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Moving data from XML to relations
Result is a two-row relation with XMLTypes
Can use extractvalue to extract this data
First, create destination table:
CREATE TABLE LINE_ITEM(
  ORDER_NBR NUMBER(9) NOT NULL,
  PART_NBR VARCHAR2(20) NOT NULL,
  QTY NUMBER(5) NOT NULL,
  FILLED_QTY NUMBER(5),
  CONSTRAINT line_item_pk PRIMARY KEY (ORDER_NBR,PART_NBR));

72 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Moving data from XML to relations
Then insert results:
SQL> INSERT INTO line_item(order_nbr,part_nbr,qty)
     SELECT 7109,
            extractvalue(column_value, '/item/part_number'),
            extractvalue(column_value, '/item/quantity')
     FROM TABLE(
       SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))
       FROM purchase_order
     );
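The TABLE(SELECT ...) form above relies on the subquery returning a single collection, so it is likely to fail once purchase_order holds more than one row. A commonly used alternative, sketched here under the assumption that the documents have the same item/part_number/quantity layout, is to join each purchase order to its own items (the alias i and the output column names are illustrative):

```sql
-- One output row per item, for every purchase order in the table.
SELECT po.po_id,
       extractvalue(value(i), '/item/part_number') AS part_nbr,
       extractvalue(value(i), '/item/quantity')    AS qty
FROM purchase_order po,
     TABLE(xmlsequence(extract(po.purchase_order_doc,
                               '/purchase_order//item'))) i;
```

Because the TABLE expression refers to po, it is evaluated once per purchase order, so po_id shows which document each item came from.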

73 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML Schemas and Oracle
By default, XML must be well-formed to be read into the XMLType field
XML is valid if it conforms to a schema
To use a schema with Oracle, must first register it:
declare
  bf1 bfile;
begin
  bf1 := bfilename('XML_DATA', 'purch_ord.xsd');
  dbms_xmlschema.registerschema(
    ' /schemas/purch_ord.xsd', bf1);
end;
/

74 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XML Schemas and Oracle
With schema registered, can apply it to an XMLType field
CREATE TABLE purchase_order2
  (po_id NUMBER(5) NOT NULL,
   customer_po_nbr VARCHAR2(20),
   customer_inception_date DATE,
   order_nbr NUMBER(5),
   purchase_order_doc XMLTYPE,
   CONSTRAINT purchase_order2_pk PRIMARY KEY (po_id))
  XMLTYPE COLUMN purchase_order_doc
  XMLSCHEMA " ch_ord.xsd"
  ELEMENT "purchase_order";

75 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Importing to schema field
Try to import xml file, get error:
declare
  bf1 bfile;
begin
  bf1 := bfilename('XML_DATA', 'purch_ord.xml');
  insert into purchase_order2(po_id, purchase_order_doc)
  values (2000, XMLTYPE(bf1, nls_charset_id('WE8MSWIN1252')));
end;
/

76 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Importing to schema field
Root node of XML must specify the schema
Change root to the following:
<purchase_order xmlns:xsi=" xsi:noNamespaceSchemaLocation=" ome/xml/schemas/purch_ord.xsd">
Now can import
Also fails if extra or missing nodes
- Modify company_name node
- Add new comments node

77 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Can check to see whether schema is used
Can call isSchemaBased(), getSchemaURL() and isSchemaValid() on XMLType fields:
SQL> select po.purchase_order_doc.isSchemaBased(),
            po.purchase_order_doc.getSchemaURL(),
            po.purchase_order_doc.isSchemaValid()
     from purchase_order2 po;

78 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Updating XMLType data
Can update XMLType data with ordinary UPDATE statements:
SQL> UPDATE purchase_order po
     SET po.purchase_order_doc =
       XMLTYPE(BFILENAME('XML_DATA', 'purch_ord_alt.xml'),
               nls_charset_id('WE8MSWIN1252'))
     WHERE po.po_id = 2000;
Replaces whole XMLType object with new one

79 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Updating XMLType data
Can also modify the existing XMLType object
- By writing node values
updateXML() function does search/replace
- But searches for node, not value
SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order po WHERE po_id = 1000;
SQL> UPDATE purchase_order po
     SET po.purchase_order_doc =
       updateXML(po.purchase_order_doc,
                 '/purchase_order/customer_name/text()',
                 'some other company')
     WHERE po.po_id = 1000;

80 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Updating XMLType data
Can also write whole node, using XMLType:
SQL> UPDATE purchase_order po
     SET po.purchase_order_doc =
       updateXML(po.purchase_order_doc,
                 '/purchase_order/customer_name',
                 XMLTYPE('<customer_name>some third company</customer_name>'))
     WHERE po.po_id = 1000;
SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name')
     FROM purchase_order po WHERE po_id = 1000;
Validation/well-formedness is still checked

81 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Updating XMLType data
And can update items in a collection:
SQL> SELECT extract(po.purchase_order_doc, '/purchase_order//item')
     FROM purchase_order po WHERE po.po_id = 1000;
SQL> UPDATE purchase_order po
     SET po.purchase_order_doc =
       updateXML(po.purchase_order_doc,
                 '/purchase_order/po_items/item[1]',
                 XMLTYPE(' T '))
     WHERE po.po_id = 1000;

82 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Converting relational data to XML
Saw how to put XML in a table
Conversely, can convert ordinary relational data to XML
- XMLElement() generates an XML node
First, create supplier table:
CREATE TABLE SUPPLIER(
  SUPPLIER_ID NUMBER(5) NOT NULL,
  NAME VARCHAR2(30) NOT NULL,
  PRIMARY KEY (SUPPLIER_ID));
insert into supplier values(1, 'Acme');
insert into supplier values(2, 'Tilton');
insert into supplier values(3, 'Eastern');

83 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Converting relational data to XML
Now can call XMLElement function to wrap values in tags:
SELECT XMLElement("supplier_id", s.supplier_id) ||
       XMLElement("name", s.name) xml_fragment
FROM supplier s;
And can build it up:
SELECT XMLElement("supplier",
         XMLElement("supplier_id", s.supplier_id),
         XMLElement("name", s.name))
FROM supplier s;
Don’t concatenate!
- Turns to strings, escapes
- Error in book

84 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XMLForest()
More simply, can use XMLForest() function:
SELECT XMLElement("supplier",
         XMLForest(s.supplier_id, s.name))
FROM supplier s;

85 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 XMLAgg()
Can use XMLAgg() to put nodes together inside another node:
SELECT XMLElement("supplier_list",
         XMLAgg(XMLElement("supplier",
                  XMLElement("supplier_id", s.supplier_id),
                  XMLElement("name", s.name)
         ))) xml_document
FROM supplier s;
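XMLForest() and XMLAgg() can be combined into one shorter query. The sketch below assumes the supplier table created earlier. The quoted aliases are there because XMLForest() otherwise takes its tag names from the column names, which Oracle stores in upper case, and the ORDER BY inside XMLAgg() is an optional extra that fixes the order of the aggregated nodes.

```sql
SELECT XMLElement("supplier_list",
         XMLAgg(
           XMLElement("supplier",
             XMLForest(s.supplier_id AS "supplier_id",  -- alias gives a lower-case tag
                       s.name        AS "name"))
           ORDER BY s.name                              -- suppliers sorted by name
         )) xml_document
FROM supplier s;
```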

86 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 New topic: Data Warehousing
Physical warehouse: stores different kinds of items
- combined from different sources in supply chain
- access items as a combined package
- “Synergy”
DW is the system containing the data from many DBs
OLAP is the system for easily querying the DW
- Online analytical processing
- front-end to DW & stats

87 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Integrating Data
Ad hoc combination of DBs from different sources can be problematic
Data may be spread across many systems
- geographically
- by division
- different systems from before mergers…

88 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Conversion/scrubbing/merging
Lots of issues…
- Different types of data: Varchar(255) v. char(30)
- Different values for data: 'GREEN' / 'GR' / 2
- Semantic differences: Cars v. Automobiles
- Missing values: handle with nulls or XML

89 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Federated DBs
Situation: n different DBs must work together
One idea: write programs for each DB to talk to each of the others
- How many programs required?
- Like ambassadors for each country

90 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Federated DBs
Better idea: introduce another DB
- write programs for it to talk to each other DB
Now how many programs?
- English in business, French in diplomacy
- Warehousing
- Refreshed nightly
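To make the two counts concrete, here is a back-of-the-envelope tally. It assumes one program per ordered pair of DBs in the pairwise design and one program per DB in the central design. Both conventions are assumptions, and counting one program per direction in the central design would give 2n instead of n.

```latex
\[
\underbrace{n(n-1)}_{\text{pairwise}}
\quad\text{vs.}\quad
\underbrace{n}_{\text{central DB}},
\qquad
\text{e.g. } n = 10:\; 10 \cdot 9 = 90 \;\text{vs.}\; 10 .
\]
```

The pairwise count grows quadratically with n and the central count grows linearly, which is the argument for the warehouse.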

91 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 OLTP v. OLAP
DWs usually not updated in real-time
- data is usually not live
- but care about higher-level, longer-term patterns
- For “knowledge workers”/decision-makers
Live data is in system used by OLTP
- online transaction processing
- E.g., airline reservations
- OLTP data loaded into DW periodically, say nightly

92 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Utilizing Data
Situation: each time a manager has a hunch
→ requests custom reports
→ directs programmers to write/modify SQL app to produce these results
- on higher or lower levels, for different specifics
Problem: too difficult/expensive/slow
- too great a time lag

93 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 EISs
Could just write queries at command-prompt
But decision makers aren’t (all) SQL programmers
Solution: create an executive information system
- provides friendly front-end to common, important queries
- basically a simple DB front-end
- your project part 5
GROUP BY queries are particularly applicable…

94 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 EISs v. OLAP
Okay for fixed set of queries
But what if queries are open-ended?
Q: What’s driving sales in the Northeast?
- What’s the source cause?
- Result from one query influences next query tried
OLAP systems are interactive:
- run query
- analyze results
- think of new query
- repeat

95 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Star Schemas
Popular schema for DW data
One central DB surrounded by specific DBs
Center: fact table
Extremities: data tables (the dimension tables)
Fields in fact table are foreign keys to data tables
Normalization → Snowflake Schema
- May not be worthwhile…

96 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Dates and star schemas
OLAP behaves as though you had a Days table, with every possible row
- Dates(day, week, month, year, DID)
- (5, 27, 7, 2000)
Can join on Days like any other table

97 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Dates and star schemas
E.g.: products x salesperson x region x date
- Products sold by salespeople in regions on dates
Regular dim tables:
- Product(PID, name, color)
- Emp(name, SSN, sal)
- Region(name, RID)
Fact table:
- Sales(PID, DID, SSN, RID)
- Interpret as a cube (cross product of all dimensions)
Can have both data and stats
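A query against this star schema typically joins the fact table to whichever dimension tables the question involves, then groups. The sketch below assumes the tables exist exactly as listed, with Dates as the date dimension, and that each Sales row records one sale; the output column names are illustrative.

```sql
-- Number of sales per region per month: join the fact table to two of
-- its dimension tables through the foreign keys RID and DID.
SELECT r.name AS region, d.year, d.month, COUNT(*) AS num_sales
FROM Sales s
  JOIN Region r ON r.RID = s.RID
  JOIN Dates  d ON d.DID = s.DID
GROUP BY r.name, d.year, d.month
ORDER BY r.name, d.year, d.month;
```

Dimensions that the question does not mention (here Product and Emp) are simply left out of the join.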

98 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Drill-down & roll-up
Imagine: notice some region’s sales way up
Why? Good salesperson? Some popular product there?
Maybe need to search by month, or month and product, abstract back up to just product…
“slicing & dicing”

99 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 OLAP and data warehousing
Could write GROUP BY queries for each
OLAP systems provide simpler, non-SQL interface for this sort of thing
Vendors: MicroStrategy, SAP, etc.
On the other hand: DW-style operators have been added to SQL and some DBMSs…

100 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: ROLLUP (Oracle)
Examples derived from Mastering Oracle SQL, 2e (O’Reilly)
Get data here:
Suppose we have an orders table (from two years), with region and date info:
SQL> column month format a10
SQL> describe all_orders
Can select total sales:
SELECT sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id;

101 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: ROLLUP (Oracle)
Can write GROUP BY queries for year or region or both:
SELECT r.name region, o.year, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY (r.name, o.year);

102 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: ROLLUP (Oracle)
ROLLUP operator
- Extension of GROUP BY
- Does GROUP BY on several levels, simultaneously
- Order matters
Get sales totals for each region/year pair, each region, and the grand total:
SELECT r.name region, o.year, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY ROLLUP (r.name, o.year);
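One way to see what ROLLUP (r.name, o.year) computes is to write out the three GROUP BY levels by hand and glue them together. This is a sketch of the equivalence, assuming the all_orders and region tables above; the alias total is illustrative, and the row order may differ from what ROLLUP prints.

```sql
-- Level 1: one row per region/year pair
SELECT r.name region, o.year, sum(o.tot_sales) total
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY r.name, o.year
UNION ALL
-- Level 2: one subtotal row per region (year shown as NULL)
SELECT r.name, NULL, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY r.name
UNION ALL
-- Level 3: the grand total (region and year both NULL)
SELECT NULL, NULL, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id;
```

There is no per-year subtotal here, because ROLLUP only drops fields from the right-hand end of its list. That is why the order of the fields matters.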

103 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: ROLLUP (Oracle)
Change the order of the group fields to get a different sequence of groups
To get totals for each year/region pair, each year, and the grand total, just reverse the group-by order:
SELECT o.year, r.name region, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY ROLLUP (o.year, r.name);

104 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: ROLLUP (Oracle)
Adding more dimensions, like month, is easy (apart from formatting):
SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month,
       r.name region, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY ROLLUP (o.year, o.month, r.name);
NB: summing happens on each level

105 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: ROLLUP (Oracle)
If desired, can combine fields for the sake of grouping:
SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month,
       r.name region, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY ROLLUP ((o.year, o.month), r.name);

106 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: CUBE (Oracle)
Another GROUP BY extension: CUBE
- Subtotals all possible combinations of group-by fields (powerset)
- Syntax: “ROLLUP” → “CUBE”
- Order of fields doesn’t matter (apart from ordering)
To get subtotals for each region/month pair, each region, each month, and the grand total:
SELECT to_char(to_date(o.month, 'MM'),'Month') month,
       r.name region, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY CUBE (o.month, r.name);
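The powerset can be spelled out explicitly: with two fields, CUBE covers 2^2 = 4 groupings. The sketch below writes them as GROUPING SETS and adds the GROUPING() function, which returns 1 when a column has been rolled up into a subtotal and 0 otherwise. It assumes the same all_orders and region tables; the aliases month_rolled_up, region_rolled_up and total are illustrative.

```sql
SELECT o.month, r.name region,
       GROUPING(o.month) month_rolled_up,   -- 1 on rows that total over all months
       GROUPING(r.name)  region_rolled_up,  -- 1 on rows that total over all regions
       sum(o.tot_sales)  total
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY GROUPING SETS (
  (o.month, r.name),  -- each month/region pair
  (o.month),          -- each month
  (r.name),           -- each region
  ()                  -- the grand total
);
```

The GROUPING() columns make it possible to tell a subtotal row from a row whose region or month is genuinely NULL in the data.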

107 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW extensions in SQL: CUBE (Oracle)
Again, can easily add more dimensions:
SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month,
       r.name region, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY CUBE (o.year, o.month, r.name);

108 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 DW SQL exts: GROUPING SETS (Oracle)
That’s a lot of rows
Instead of a cube of all combinations, maybe we just want the totals for each individual field:
SELECT o.year, to_char(to_date(o.month, 'MM'),'Month') month,
       r.name region, sum(o.tot_sales)
FROM all_orders o join region r ON r.region_id = o.region_id
GROUP BY GROUPING SETS (o.year, o.month, r.name);

109 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 Next
- Final evals
- More lab…

110 Matthew P. Johnson, OCL3, CISDD CUNY, June 2005 That’s all, folks!
Selected solutions to exercises:
- sqlzoo ~ “Answers” on sqlzoo.net
- PL/SQL ~ archive/fall04/plsql/
mpjohnson-at-gmail.com