M.P. Johnson, DBMS, Stern/NYU, Spring C : Database Management Systems Lecture #24 M.P. Johnson Stern School of Business, NYU Spring, 2005
Homework
  - Project part 5
    - Topic: web interface + any remaining loose ends
    - Up now
    - Due: end of semester
    - Run, don’t walk
    - Important: if you use data you got from someone else (e.g., from the web), this should be visibly cited on your site
  - Hw3 is up (optional)

Agenda
  - Injection attack prevention in Perl
  - XML

Goals after today
  - Know how to prevent injection attacks in Perl
  - Know something about XML

Review: Why security is hard
  - It’s a “negative deliverable”
  - It’s an asymmetric threat
    - “Remember, there are 1000 warheads unaccounted for. Marwan only needs one.” – Jack Bauer
    - Tolstoy: “Happy families are all alike; every unhappy family is unhappy in its own way.”
  - Analogs: “homeland”, jails, debugging, proof-reading, Popperian science, fishing, MC algs
  - So: fix biggest problems first

Injection attacks – MySQL/Perl/PHP
  - Consider another input:
      user: your-boss
      pass: ' OR 1=1 OR password = '
  - Query template (u and p are replaced by the user's input):
      SELECT * FROM users WHERE user = u AND password = p;
  - After substitution:
      SELECT * FROM users WHERE user = 'your-boss' AND password = '' OR 1=1 OR password = '';
  - AND binds tighter than OR, so the 1=1 branch makes the WHERE clause true for every row

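The login bypass above can be reproduced in a few lines. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL/Perl; the helper name build_login_query and the stored password 'secret' are made up for illustration.

```python
import sqlite3

def build_login_query(user: str, password: str) -> str:
    # UNSAFE on purpose: user input is pasted straight into the SQL text.
    return ("SELECT * FROM users WHERE user = '" + user +
            "' AND password = '" + password + "'")

# Throwaway in-memory database with one (hypothetical) account.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('your-boss', 'secret')")

# A wrong guess matches nothing, as intended.
honest = build_login_query("your-boss", "wrong-guess")
honest_rows = conn.execute(honest).fetchall()

# The slide's input closes the quote early and adds an always-true branch.
attack = build_login_query("your-boss", "' OR 1=1 OR password = '")
attack_rows = conn.execute(attack).fetchall()

print(attack)
print(honest_rows, attack_rows)
```

The attacker never needed the real password: the query text itself was changed by the input.
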
Injection attacks – MySQL/Perl/PHP
  - Consider another input:
      user: your-boss
      pass: ' OR 1=1 AND user = 'your-boss
  - Delete your boss!
  - Query template:
      DELETE FROM users WHERE user = u AND password = p;
  - After substitution:
      DELETE FROM users WHERE user = 'your-boss' AND password = '' OR 1=1 AND user = 'your-boss';
  - The second branch (1=1 AND user = 'your-boss') is true for the boss's row, so that row is deleted without the password

Preventing injection attacks
  - Ultimate source of problem: quotes
  - Soln 1: don’t allow quotes!
    - Reject any entered data containing single quotes
    - Q: Is this satisfactory?
    - Does Amazon need to sell O’Reilly books?
  - Soln 2: escape any single quotes
    - Replace any ' with a '' or \'
    - In Perl, use taint mode – won’t show
    - In PHP, turn on the magic_quotes_gpc flag in .htaccess
    - Show both PHP versions

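Soln 2 can be sketched as follows, again assuming Python with sqlite3 rather than Perl or PHP. The function names and the sample book row are illustrative only. SQLite accepts the SQL-standard doubled quote (''); the backslash form (\') is a MySQL convention and is not used here.

```python
import sqlite3

def escape_single_quotes(value: str) -> str:
    # SQL-standard escaping: a literal ' inside a string is written as ''.
    return value.replace("'", "''")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, publisher TEXT)")
# Sample row (illustrative data); note the doubled quote in the literal.
conn.execute("INSERT INTO books VALUES ('Programming Perl', 'O''Reilly')")

def titles_by_publisher(publisher: str) -> list:
    # Still builds SQL by concatenation, but only after escaping the input.
    query = ("SELECT title FROM books WHERE publisher = '"
             + escape_single_quotes(publisher) + "'")
    return [row[0] for row in conn.execute(query)]

# A legitimate name containing a quote now works...
legit = titles_by_publisher("O'Reilly")
# ...and the injection string is treated as one odd publisher name.
attack = titles_by_publisher("' OR 1=1 OR publisher = '")
print(legit, attack)
```

Hand-rolled escaping is easy to get subtly wrong (other special characters, other encodings), which is one reason to prefer parameter-based queries.
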
Preventing injection attacks
  - Soln 3: use prepared, parameter-based queries
    - Supported in JDBC, Perl DBI, PHP ext/mysqli
  - Very dangerous: using tainted data to run commands at the Unix command prompt
    - Semi-colons, prime char, etc.
  - Safest: define the set of legal chars, not the illegal ones

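A minimal sketch of Soln 3 and of the "legal chars" idea, assuming Python's sqlite3 (whose ? placeholders play the same role as placeholders in Perl DBI). The table contents, the function names, and the particular whitelist pattern are assumptions chosen for illustration.

```python
import re
import sqlite3

# Whitelist: say what IS allowed (an assumed policy), not what is forbidden.
LEGAL_USERNAME = re.compile(r"[A-Za-z0-9_-]{1,32}")

def is_legal_username(name: str) -> bool:
    return LEGAL_USERNAME.fullmatch(name) is not None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, password TEXT)")
# Plaintext password only to keep the sketch short.
conn.execute("INSERT INTO users VALUES ('your-boss', 'secret')")

def check_login(user: str, password: str) -> bool:
    # The SQL text is fixed; the values travel separately as parameters,
    # so quotes inside them cannot change the structure of the query.
    row = conn.execute(
        "SELECT 1 FROM users WHERE user = ? AND password = ?",
        (user, password),
    ).fetchone()
    return row is not None

print(check_login("your-boss", "secret"))
print(check_login("your-boss", "' OR 1=1 OR password = '"))
print(is_legal_username("your-boss"), is_legal_username("x; rm -rf /"))
```
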
Review: secure hashing
  - We store hashed passwords instead of the passwords themselves
  - Why?
  - Shouldn’t the hashed passwords still be secret?

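One way to store hashed passwords, sketched with Python's hashlib. The per-user random salt, the PBKDF2 construction, and the iteration count are assumptions added for the example; the slide only says that a hash is stored instead of the password.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen for illustration

def hash_password(password: str, salt: bytes = None):
    # A fresh random salt per user makes identical passwords hash differently.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    # Re-hash the attempt with the stored salt and compare in constant time.
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

# The database would keep only (salt, digest), never the password itself.
salt, digest = hash_password("hunter2")
print(verify_password("hunter2", salt, digest))
print(verify_password("wrong", salt, digest))
```

Even so, the stored hashes are best kept secret: anyone holding them can test guesses offline at their own pace.
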
And now for something completely different: XML
  - XML: eXtensible Mark-up Language
  - Very popular language for semi-structured data
  - Mark-up language: consists of elements composed of tags, like HTML
  - Emerging lingua franca of the Internet, Web Services, inter-vendor communication

Unstructured data
  - At one end of the continuum: unstructured data
    - Text files
    - Stock market prices
    - CIA intelligence intercepts
    - Audio recordings
  - “Just one damn bit after another”
    - Churchill? Henry Ford?
  - No (intentional, formal) patterns to the data
  - Difficult to manage/make sense of
    - Why we need data-mining

Structured data
  - At the other end: structured data
    - Tables in RDBMSs
  - Data organized into semantic chunks (entities)
  - Similar/related entities grouped together
    - Relationships, classes
  - Entities in same group have same structure
    - Same fields/attributes/properties
  - Easy to make sense of
    - But sometimes too rigid a requirement
    - Difficult to send: convert to tab-delimited

Semi-structured data
  - Not too random
    - Data organized into entities
    - Similar/related grouped to form other entities
  - Not too structured
    - Some attributes may be missing
    - Size of attributes may vary
    - Support of lists/sets
  - Juuust Right
    - Data is self-describing

Semi-structured data
  - Predominant examples:
    - HTML: HyperText Mark-up Language
    - XML: eXtensible Mark-up Language
  - NB: both are mark-up languages (use tags)
  - Mark-up lends itself to semi-structured data
    - Demarcate boundaries for entities
    - But freely allow other entities inside

Data model for semi-structured data
  - Usually represented as directed graphs
  - Graph: set of vertices (nodes) and edges
    - Dots connected by lines; not necessarily a tree!
  - In the model:
    - Nodes ~ entities or fields/attributes
    - Edges ~ attribute-of/sub-entity-of
  - Example: publisher publishes >=0 books
    - Each book has one title, one year, >=1 authors
  - Draw publishers graph

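The publisher example can be written down as a labeled directed graph. This is only a sketch: the node ids (pub1, book1, ...) and all titles and names are invented, and an edge list is just one of several reasonable encodings.

```python
# Each edge is (source node, label, target). Targets are either other
# node ids (sub-entities) or plain strings (atomic attribute values).
edges = [
    ("pub1", "name", "Example Press"),
    ("pub1", "book", "book1"),
    ("pub1", "book", "book2"),
    ("book1", "title", "First Title"),
    ("book1", "year", "2004"),
    ("book1", "author", "auth1"),
    ("book2", "title", "Second Title"),
    ("book2", "year", "2005"),
    ("book2", "author", "auth1"),
    ("book2", "author", "auth2"),
    ("auth1", "name", "A. Author"),
    ("auth2", "name", "B. Writer"),
]

def targets(source: str, label: str) -> list:
    # Follow all edges with the given label out of one node.
    return [t for (s, l, t) in edges if s == source and l == label]

def in_degree(node: str) -> int:
    # Number of edges pointing at a node.
    return sum(1 for (_, _, t) in edges if t == node)

print(targets("pub1", "book"))
print(targets("book2", "author"))
# auth1 is reached from two books, so this graph is not a tree.
print(in_degree("auth1"))
```
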
XML is an SSD language
  - Standard published by W3C
    - Officially announced/recommended in 1998
  - XML != HTML
  - XML != a replacement for HTML
  - Both are mark-up languages
  - Big diffs:
    - XML doesn’t use predefined tags (!)
    - But it’s extensible: tags can be added
    - HTML is about presentation
    - XML is about content

XML syntax
  - Like HTML in many respects but more strict
  - All tags must be closed
    - Can’t have an unclosed start tag before text such as: this is a line
    - Every start tag has an end tag
    - Although the empty-element style (as in <br/>) can replace both
  - IS case-sensitive
  - IS space-sensitive
  - XML doc has a unique root element

XML syntax
  - Tags must be properly nested
    - Not allowed: overlapping tags, as around the text “I’m not kidding”
    - Intuition: file folders
  - Elements may have quoted attributes …
  - Comments same as in HTML: <!-- comment -->
  - Draw publishers XML

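The syntax rules from the last two slides can be checked mechanically by handing text to an XML parser. A small sketch using Python's xml.etree.ElementTree; the helper name is_well_formed and the sample strings are made up.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    # The parser raises ParseError on any well-formedness violation.
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

samples = {
    "properly nested": "<a><b>I'm not kidding</b></a>",
    "empty-element style": "<line/>",
    "overlapping tags": "<b><i>I'm not kidding</b></i>",
    "unclosed tag": "<p>this is a line",
    "case mismatch": "<Title>XML</title>",
    "two root elements": "<a/><b/>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first two samples parse; each of the others breaks one rule from the slides (nesting, closing, case-sensitivity, unique root).
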
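Escape chars in XML
  - Some chars must be escaped
    - Distinguish content from syntax
        escape    char
        &gt;      >
        &lt;      <
        &amp;     &
        &quot;    "
        &apos;    '
  - Can also declare value to be pure text:
        <![CDATA[jsdljsd <>>]]>
  - Examples:
        3 &lt; 5    (displays as: 3 < 5)
        &quot;Don&apos;t call me &apos;Ishmael&apos;!&quot;    (displays as: "Don't call me 'Ishmael'!")
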
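In practice the escaping is usually left to a library. A short sketch with Python's xml.sax.saxutils: escape() handles &, < and > by default, and the two quote characters only if they are passed in an extra table, as done here.

```python
from xml.sax.saxutils import escape, unescape

# Extra replacements for the two quote characters.
QUOTES = {'"': "&quot;", "'": "&apos;"}

comparison = escape("3 < 5")
company = escape("AT&T")
quotation = escape("\"Don't call me 'Ishmael'!\"", QUOTES)

print(comparison)
print(company)
print(quotation)

# unescape() reverses the default replacements.
print(unescape(comparison))
```
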
XML Namespaces
  - Different schemas/DTDs may overlap
    - XHTML and MathML share some tags
  - Soln: namespaces
    - As in Java/C++/C#

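From Relational Data to XML Data
  - Relational table:
      Name      SSN   Mailing-address
      Michael   123   NY
      Hilary    456   DC
      Bill      789   Chappaqua
  - XML: a persons root with one row element per tuple, each holding name and ssn:
      <persons>
        <row> <name>Michael</name> <ssn>123</ssn> </row>
        <row> <name>Hilary</name> <ssn>456</ssn> </row>
        <row> <name>Bill</name> <ssn>789</ssn> </row>
      </persons>
  - As a tree: persons at the root, row nodes below it, name and ssn leaves (“Michael”, 123, “Hilary”, “Bill”, …)
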
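The table-to-XML step can be sketched with Python's xml.etree.ElementTree. The element names persons, row, name, and ssn follow the slide; the function name rows_to_xml is made up, and the mailing address is left out because the slide's XML shows only name and ssn.

```python
import xml.etree.ElementTree as ET

# The relational tuples from the slide: (Name, SSN, Mailing-address).
rows = [
    ("Michael", "123", "NY"),
    ("Hilary", "456", "DC"),
    ("Bill", "789", "Chappaqua"),
]

def rows_to_xml(table_rows) -> ET.Element:
    # One <row> element per tuple, one child element per column kept.
    root = ET.Element("persons")
    for name, ssn, _address in table_rows:
        row = ET.SubElement(root, "row")
        ET.SubElement(row, "name").text = name
        ET.SubElement(row, "ssn").text = ssn
    return root

persons = rows_to_xml(rows)
xml_text = ET.tostring(persons, encoding="unicode")
print(xml_text)
```
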
Semi-structured Data Explained
  - List-valued attributes
  - XML is not 1NF!
  - Impossible in (single, BCNF) tables: two phones!
      name    phone
      Bill    ???
  - XML example persons: Hilary, Bill

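A list-valued attribute is easy to express in XML: the element is simply repeated. A sketch in Python; the phone numbers and the element names person and phone are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

# Bill has two phones, Hilary has none: no padding or extra table needed.
doc = """
<persons>
  <person>
    <name>Bill</name>
    <phone>555-0100</phone>
    <phone>555-0101</phone>
  </person>
  <person>
    <name>Hilary</name>
  </person>
</persons>
"""

root = ET.fromstring(doc)

# Map each name to the (possibly empty) list of that person's phones.
phones = {
    person.findtext("name"): [p.text for p in person.findall("phone")]
    for person in root.findall("person")
}
print(phones)
```
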
Object ids and References
  - SSD graphs might not be trees!
    - But XML docs must be
    - Would cause much redundancy
  - Soln: same concept as pointers in C/C++/J
    - Object ids and references
  - Graph example:
    - Movies: Lost in Translation, Hamlet
    - Stars: Bill Murray, Scarlett Johansson
  - Example data: Lost in Translation (2003), Hamlet (1999), Bill Murray

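One possible encoding of the movie graph with ids and references, sketched in Python. The element names, the id values (m1, s1, ...), and the space-separated stars attribute are assumptions; the original markup of the slide's example is not recoverable here.

```python
import xml.etree.ElementTree as ET

# Each star is written once and referred to by id from the movies,
# so Bill Murray is not duplicated under both films.
doc = """
<moviedb>
  <movie id="m1" stars="s1 s2"><title>Lost in Translation</title><year>2003</year></movie>
  <movie id="m2" stars="s1"><title>Hamlet</title><year>1999</year></movie>
  <star id="s1"><name>Bill Murray</name></star>
  <star id="s2"><name>Scarlett Johansson</name></star>
</moviedb>
"""

root = ET.fromstring(doc)

# Index every element that carries an id, like a table of pointers.
by_id = {elem.get("id"): elem for elem in root.iter() if elem.get("id")}

def stars_of(movie_id: str) -> list:
    # Dereference each id listed in the movie's stars attribute.
    movie = by_id[movie_id]
    return [by_id[ref].findtext("name") for ref in movie.get("stars").split()]

print(stars_of("m1"))
print(stars_of("m2"))
```
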
What do we do with XML?
  - Things done with XML:
    - Send to partners
    - Parse XML received
    - Convert to RDBMS rows
    - Query for particular data
    - Convert to other XML
    - Convert to formats other than XML
  - Lots of tools/standards for these…

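Three of the tasks listed above (parse received XML, convert to RDBMS rows, query) fit in one short sketch. It assumes Python's ElementTree and sqlite3 and reuses the persons/row/name/ssn layout from the earlier relational example; the table name is made up.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Pretend this text arrived from a partner.
received = (
    "<persons>"
    "<row><name>Michael</name><ssn>123</ssn></row>"
    "<row><name>Hilary</name><ssn>456</ssn></row>"
    "<row><name>Bill</name><ssn>789</ssn></row>"
    "</persons>"
)

# 1. Parse the XML received.
root = ET.fromstring(received)

# 2. Convert to RDBMS rows.
rows = [(r.findtext("name"), r.findtext("ssn")) for r in root.findall("row")]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name TEXT, ssn TEXT)")
conn.executemany("INSERT INTO persons VALUES (?, ?)", rows)

# 3a. Query the relational copy with SQL.
hilary_ssn = conn.execute(
    "SELECT ssn FROM persons WHERE name = ?", ("Hilary",)
).fetchone()[0]

# 3b. Or query the XML directly with a path expression.
names = [e.text for e in root.findall("./row/name")]

print(hilary_ssn, names)
```
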
DTDs & understanding XML
  - XML is extensible
  - Advantage: when creating, we can use any tags we like
  - Disadvantage: when reading, they can use any tags they like
  - Using XML docs a priori is very difficult
  - Solution: impose some constraints

DTDs
  - DTD: Document Type Definition
  - You and partners/vertical industry/academic discipline decide on a DTD/schema for your docs
    - Specify which entities you may use/must understand
    - Specify legal relationships
  - DTD specifies the grammar to be used
    - DTD = set of rules for creating valid entities
  - DTD tells your software what to look for in the doc

DTD examples
  - Well-formed XML v. valid XML
  - Simple example: partial publisher rules
      Root      -> publisher
      Publisher -> name, book*, author*
      Book      -> title, date, author+
      Author    -> firstname, middlename?, lastname
  - Notation: * = zero or more, + = one or more, ? = optional

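The difference between well-formed and valid can be illustrated by checking the publisher rules by hand. This is a sketch, not real DTD validation: Python's standard-library parsers do not validate against a DTD, so the rules are transcribed into regular expressions over each element's sequence of child tags. The lowercase tag names and the sample documents are assumptions.

```python
import re
import xml.etree.ElementTree as ET

# One regex per rule; a child sequence is encoded as "tag,tag,tag,".
RULES = {
    "publisher": r"name,(book,)*(author,)*",
    "book": r"title,date,(author,)+",
    "author": r"firstname,(middlename,)?lastname,",
}

def is_valid(elem: ET.Element) -> bool:
    rule = RULES.get(elem.tag)
    if rule is None:
        # Elements without a rule are treated as text-only leaves.
        return len(elem) == 0
    sequence = "".join(child.tag + "," for child in elem)
    if re.fullmatch(rule, sequence) is None:
        return False
    return all(is_valid(child) for child in elem)

good = ET.fromstring(
    "<publisher><name>P</name><book><title>T</title><date>2005</date>"
    "<author><firstname>A</firstname><lastname>B</lastname></author>"
    "</book></publisher>"
)
# Well-formed, but the book has no author, so it breaks the author+ rule.
bad = ET.fromstring(
    "<publisher><name>P</name><book><title>T</title><date>2005</date>"
    "</book></publisher>"
)

print(good.tag == "publisher" and is_valid(good))
print(bad.tag == "publisher" and is_valid(bad))
```
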
Partial DTD example (typos!)
      <!DOCTYPE PUBLISHER [ …
  - DTD is not XML, but can be embedded in or referenced from XML
  - Replacement for DTDs is XML Schema

XML Applications/dialects
  - MathML: Mathematical Markup Language
    - …ations/ictp99/ictp99N8059.html
  - VoiceXML
    - …es/rps.xml
  - ChemML: Chemical Markup Language
  - XHTML: HTML retrofitted as an XML application

XML Applications/dialects
  - MathML: Mathematical Markup Language
    - …99/ictp99N8059.html
  - ChemML: Chemical Markup Language
  - X4ML: XML for Merrill Lynch
  - XHTML: HTML retrofitted as an XML application
    - Validation

XML Applications/dialects
  - VoiceXML: AT&T Directory Assistance
  - (image)

More XML Apps
  - FIXML
    - XML equivalent of FIX: Financial Information eXchange
  - swiftML
    - XML equivalent of SWIFT: Society for Worldwide Interbank Financial Telecommunications message format
  - Apache’s Ant
    - Scripting language for Java build management
  - Many more

More XML Applications/Protocols
  - RSS: Rich Site Summary/Really Simple Syndication
    - News sites, blogs…
    - (screenshot)
  - Example feed content: channel “my channel” with item “story 1” … // other items

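A minimal sketch of reading an RSS feed in Python. The feed text is a hand-written RSS 2.0 fragment built around the slide's "my channel" / "story 1" example; the link, the description, and the second item are invented.

```python
import xml.etree.ElementTree as ET

# An RSS 2.0 document: <rss> wraps one <channel>, which holds <item>s.
feed = """
<rss version="2.0">
  <channel>
    <title>my channel</title>
    <link>http://example.com/</link>
    <description>a made-up feed</description>
    <item><title>story 1</title></item>
    <item><title>story 2</title></item>
  </channel>
</rss>
"""

def item_titles(feed_text: str) -> list:
    # Collect the title of every item in the channel.
    root = ET.fromstring(feed_text)
    return [item.findtext("title") for item in root.findall("./channel/item")]

channel_title = ET.fromstring(feed).findtext("./channel/title")
titles = item_titles(feed)
print(channel_title, titles)
```
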
More XML Applications/Protocols
  - SOAP: Simple Object Access Protocol
    - XML-based messaging format
  - Used by: Google API, Amazon API, Amazon light
    - Other examples: …10&topic=&topic_set=
  - SOAP envelope with header and body
    - Request sales tax for total
      <SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1"> … 100 …

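A sketch of building such an envelope in Python. Only the SOAP prefix, the namespace URI, and the value 100 come from the slide; the request element names GetSalesTax and SalesTotal are hypothetical stand-ins, since the body markup is not preserved.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "urn:schemas-xmlsoap-org:soap.v1"  # namespace URI from the slide
# Ask the serializer to use the SOAP: prefix for that namespace.
ET.register_namespace("SOAP", SOAP_NS)

def build_sales_tax_request(total: int) -> str:
    # Envelope with an (empty) header and a body carrying the request.
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    ET.SubElement(envelope, "{%s}Header" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    request = ET.SubElement(body, "GetSalesTax")          # hypothetical name
    ET.SubElement(request, "SalesTotal").text = str(total)  # hypothetical name
    return ET.tostring(envelope, encoding="unicode")

message = build_sales_tax_request(100)
print(message)

# The receiver parses the envelope and pulls the total back out.
parsed = ET.fromstring(message)
total = parsed.find("{%s}Body/GetSalesTax/SalesTotal" % SOAP_NS).text
print(total)
```
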
More XML Applications/Protocols
  - Request parameter values: %(key)s 0 10 true false
