M.P. Johnson, DBMS, Stern/NYU, Spring 2005
C20.0046: Database Management Systems
Lecture #24
M.P. Johnson
Stern School of Business, NYU
Spring, 2005



Homework
- Project part 5
  - Topic: web interface + any remaining loose ends
  - Up now
  - Due: end of semester
  - Run, don't walk
  - Important: if you use data from someone else (e.g., from the web), this should be visibly cited on your site
- Hw3 is up
  - Optional

Agenda
- Injection attack prevention in Perl
- XML

Goals
- After today:
  - Know how to prevent injection attacks in Perl
  - Know something about XML

Review: Why security is hard
- It's a “negative deliverable”
- It's an asymmetric threat
  - “Remember, there are 1000 warheads unaccounted for. Marwan only needs one.” (Jack Bauer)
- Tolstoy: “Happy families are all alike; every unhappy family is unhappy in its own way.”
  - Analogs: “homeland”, jails, debugging, proof-reading, Popperian science, fishing, MC algs
- So: fix biggest problems first

Injection attacks - MySQL/Perl/PHP
- Consider another input:
  - user: your-boss
  - pass: ' OR 1=1 OR pass = '
- Query template:
    SELECT * FROM users WHERE user = u AND password = p;
- With the input substituted:
    SELECT * FROM users WHERE user = 'your-boss' AND password = '' OR 1=1 OR pass = '';
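
The attack on this slide can be reproduced in a few lines. The sketch below is an illustration only: it uses Python's built-in `sqlite3` rather than MySQL/Perl/PHP, and the table contents and the `unsafe_login` helper are made up for the demo. Because `AND` binds tighter than `OR`, the injected `OR 1=1` makes the whole `WHERE` clause true for every row.

```python
import sqlite3

# Hypothetical two-row users table; the password column is called `pass` here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, pass TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("your-boss", "s3cret"), ("alice", "hunter2")],
)


def unsafe_login(user, password):
    """UNSAFE on purpose: pastes raw input between quotes, as on the slide."""
    query = (
        "SELECT * FROM users WHERE user = '" + user
        + "' AND pass = '" + password + "'"
    )
    return query, conn.execute(query).fetchall()


# A wrong password matches no rows.
_, honest_rows = unsafe_login("your-boss", "wrong")

# The slide's input rewrites the WHERE clause so that it is always true.
attack_query, attack_rows = unsafe_login("your-boss", "' OR 1=1 OR pass = '")
print(attack_query)
print(attack_rows)
```

The attacker never needed the real password: every row in the table comes back.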

Injection attacks - MySQL/Perl/PHP
- Consider another input:
  - user: your-boss
  - pass: ' OR 1=1 AND user = 'your-boss
  - Delete your boss!
- Statement template:
    DELETE FROM users WHERE user = u AND password = p;
- With the input substituted:
    DELETE FROM users WHERE user = 'your-boss' AND pass = '' OR 1=1 AND user = 'your-boss';

Preventing injection attacks
- Ultimate source of problem: quotes
- Soln 1: don't allow quotes!
  - Reject any entered data containing single quotes
- Q: Is this satisfactory?
  - Does Amazon need to sell O'Reilly books?
- Soln 2: escape any single quotes
  - Replace any ' with a '' or \'
  - In Perl, use taint mode (won't show)
  - In PHP, turn on the magic_quotes_gpc flag in .htaccess
    - Show both PHP versions

Preventing injection attacks
- Soln 3: use prepared, parameter-based queries
  - Supported in JDBC, Perl DBI, PHP ext/mysqli
  - Very dangerous: using tainted data to run commands at the Unix command prompt
  - Semi-colons, prime char, etc.
  - Safest: define the set of legal chars, not illegal ones
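
A minimal sketch of Soln 3, again using Python's built-in `sqlite3` as a stand-in for the Perl DBI / JDBC / mysqli interfaces named on the slide. The table, the `safe_login` helper and the particular whitelist pattern are assumptions chosen for illustration. The `?` placeholders send user input to the database as data, so quotes inside it are never parsed as SQL; the whitelist shows the "legal chars" idea for text headed to a shell.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, pass TEXT)")
conn.execute("INSERT INTO users VALUES ('your-boss', 's3cret')")
conn.execute("INSERT INTO users VALUES ('o''reilly', 'camel')")


def safe_login(user, password):
    """Placeholders bind the input as values; it is never spliced into the SQL text."""
    cur = conn.execute(
        "SELECT user FROM users WHERE user = ? AND pass = ?",
        (user, password),
    )
    return cur.fetchall()


# Whitelist approach: list the characters that ARE allowed (an assumed set).
LEGAL = re.compile(r"[A-Za-z0-9_.-]+")


def is_legal(text):
    """True only if every character of text is in the allowed set."""
    return LEGAL.fullmatch(text) is not None


injected = safe_login("your-boss", "' OR 1=1 OR pass = '")
quoted_name = safe_login("o'reilly", "camel")
```

With placeholders, the injection input simply fails to match any password, and a legitimate name containing a quote still works, so there is no need to reject O'Reilly.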

Review: secure hashing
- We store hashed passwords instead of the passwords themselves. Why?
- Shouldn't the hashed passwords still be secret?
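
One way to picture the answer: the server only ever needs to check a candidate password, never to read the original back. The sketch below assumes a salted PBKDF2 hash from Python's standard `hashlib`; the iteration count and helper names are illustrative choices, not something the lecture specifies. The stored values should still be kept secret, since a leaked hash lets an attacker test guesses offline; the salt and the slow hash only make each guess more expensive.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen only for this example


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is drawn when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Only the salt and the digest are stored, never the password itself.
salt, stored = hash_password("hunter2")
```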

And now for something completely different: XML
- XML: eXtensible Mark-up Language
- Very popular language for semi-structured data
- Mark-up language: consists of elements composed of tags, like HTML
- Emerging lingua franca of the Internet, Web Services, inter-vendor comm

Unstructured data
- At one end of continuum: unstructured data
  - Text files
  - Stock market prices
  - CIA intelligence intercepts
  - Audio recordings
  - “Just one damn bit after another”
    - Churchill? Henry Ford?
- No (intentional, formal) patterns to the data
- Difficult to manage/make sense of
  - Why we need data-mining

Structured data
- At the other end: structured data
  - Tables in RDBMSs
  - Data organized into semantic chunks
    - Entities
  - Similar/related entities grouped together
    - Relationships, classes
  - Entities in same group have same structure
    - Same fields/attributes/properties
- Easy to make sense of
  - But sometimes too rigid a req.
  - Difficult to send: convert to tab-delimited

Semi-structured data
- Not too random
  - Data organized into entities
  - Similar/related grouped to form other entities
- Not too structured
  - Some attributes may be missing
  - Size of attributes may vary
    - Support of lists/sets
- Juuust Right
  - Data is self-describing

Semi-structured data
- Predominant examples:
  - HTML: HyperText Mark-up Language
  - XML: eXtensible Mark-up Language
- NB: both are mark-up languages (use tags)
- Mark-up lends itself to semi-structured data
  - Demarcate boundaries for entities
  - But freely allow other entities inside

Data model for semi-structured data
- Usually represented as directed graphs
- Graph: set of vertices (nodes) and edges
  - Dots connected by lines; not nec. a tree!
- In model,
  - Nodes ~ entities or fields/attributes
  - Edges ~ attribute-of/sub-entity-of
- Example: publisher publishes >=0 books
  - Each book has one title, one year, >=1 authors
  - Draw publishers graph
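
The publisher example can be written down as a small labeled directed graph. This is only a sketch: the node ids, names and titles below are invented, and an edge list is just one of several reasonable encodings. Note that one author node is reached from two books, which is why the graph is not necessarily a tree.

```python
# Each edge is a (source, label, target) triple; all ids and values are made up.
EDGES = [
    ("pub1", "name", "Acme Press"),
    ("pub1", "book", "book1"),
    ("pub1", "book", "book2"),
    ("book1", "title", "First Book"),
    ("book1", "year", "2003"),
    ("book1", "author", "auth1"),
    ("book2", "title", "Second Book"),
    ("book2", "year", "2005"),
    ("book2", "author", "auth1"),
    ("book2", "author", "auth2"),
]


def targets(node, label):
    """Follow every edge with the given label out of node."""
    return [t for s, l, t in EDGES if s == node and l == label]


def in_degree(node):
    """Count incoming edges; a value above 1 means the graph is not a tree."""
    return sum(1 for _s, _l, t in EDGES if t == node)


books = targets("pub1", "book")
shared_author_refs = in_degree("auth1")
```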

XML is a SSD language
- Standard published by W3C
  - Officially announced/recommended in 1998
- XML != HTML
  - XML != a replacement for HTML
  - Both are mark-up languages
- Big diffs:
  - XML doesn't use predefined tags (!)
    - But it's extensible: tags can be added
  - HTML is about presentation
    - XML is about content

XML syntax
- Like HTML in many respects but more strict
- All tags must be closed
  - Can't have: <tag>this is a line
  - Every start tag has an end tag
  - Although the empty-element <tag/> style can replace both
- IS case-sensitive
- IS space-sensitive
- XML doc has a unique root element

XML syntax
- Tags must be properly nested
  - Not allowed: <b><i>I'm not kidding</b></i>
  - Intuition: file folders
- Elements may have quoted attributes
  - …
- Comments same as in HTML: <!-- comment -->
  - Draw publishers XML
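
These syntax rules are exactly what a parser enforces. As a rough check, the sketch below hands a few strings to Python's standard `xml.etree.ElementTree` parser and records whether each one parses; the sample strings are invented to exercise one rule apiece.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


checks = {
    "closed tag": is_well_formed("<p>this is a line</p>"),
    "unclosed tag": is_well_formed("<p>this is a line"),
    "empty-element style": is_well_formed("<doc><br/></doc>"),
    "bad nesting": is_well_formed("<b><i>not kidding</b></i>"),
    "case mismatch": is_well_formed("<Book></book>"),
    "two roots": is_well_formed("<a/><b/>"),
    "quoted attribute": is_well_formed('<book year="2005"/>'),
    "unquoted attribute": is_well_formed("<book year=2005/>"),
}

for rule, ok in checks.items():
    print(rule, "->", "well-formed" if ok else "rejected")
```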

Escape chars in XML
- Some chars must be escaped
  - Distinguish content from syntax
      >   &gt;
      <   &lt;
      &   &amp;
      "   &quot;
      '   &apos;
  - Examples:
      3 &lt; 5
      "Don&apos;t call me &apos;Ishmael&apos;!"
- Can also declare value to be pure text:
      <![CDATA[jsdljsd <>>]]>
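
In practice a library does the escaping. A small sketch using Python's standard `xml.sax.saxutils.escape` and the `ElementTree` parser follows; the element names `m` and `q` are placeholders. By default `escape` handles `&`, `<` and `>`, and the quote characters are added through its optional mapping argument.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Writing: turn raw text into safe element content.
escaped_math = escape("3 < 5")
escaped_amp = escape("AT&T")

# Quotes are escaped only when asked for via the extra mapping.
QUOTES = {'"': "&quot;", "'": "&apos;"}
escaped_quote = escape("Don't call me 'Ishmael'!", QUOTES)

# Reading: the parser turns entities and CDATA sections back into plain text.
parsed_math = ET.fromstring("<m>3 &lt; 5</m>").text
parsed_quote = ET.fromstring("<q>Don&apos;t call me &apos;Ishmael&apos;!</q>").text
parsed_cdata = ET.fromstring("<m><![CDATA[jsdljsd <>>]]></m>").text
```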

XML Namespaces
- Different schemas/DTDs may overlap
  - XHTML and MathML share some tags
- Soln: namespaces
  - As in Java/C++/C#
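
To see how a namespace separates two tags with the same name, consider the sketch below. The two vocabularies and their `urn:example:...` URIs are hypothetical; they stand in for any pair of overlapping schemas. The parser is Python's standard `ElementTree`, which expands each prefixed tag to `{uri}localname`.

```python
import xml.etree.ElementTree as ET

# Two made-up vocabularies that both define a <title> element.
DOC = """
<catalog xmlns:bk="urn:example:books" xmlns:pp="urn:example:people">
  <bk:title>Moby-Dick</bk:title>
  <pp:title>Captain</pp:title>
</catalog>
"""

root = ET.fromstring(DOC)
NS = {"bk": "urn:example:books", "pp": "urn:example:people"}

# The prefix picks out which vocabulary's title is meant.
book_title = root.find("bk:title", NS).text
person_title = root.find("pp:title", NS).text

# Internally the parser stores the full namespace URI with the tag name.
expanded_tag = root[0].tag
```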

From Relational Data to XML Data
- Relational table persons:
    Name     SSN  Mailing-address
    Michael  123  NY
    Hilary   456  DC
    Bill     789  Chappaqua
- XML: persons
    <persons>
      <row><name>Michael</name><ssn>123</ssn></row>
      <row><name>Hilary</name><ssn>456</ssn></row>
      <row><name>Bill</name><ssn>789</ssn></row>
    </persons>
- As a tree: root persons, one row node per tuple, each with name and ssn children
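
The mapping from table to tree is mechanical: one `row` element per tuple, one child element per column. A short sketch with Python's standard `ElementTree` follows; it emits only the name and ssn columns, as in the tree on the slide, and the helper name `rows_to_xml` is made up.

```python
import xml.etree.ElementTree as ET

# The persons table from the slide: (Name, SSN, Mailing-address).
ROWS = [
    ("Michael", 123, "NY"),
    ("Hilary", 456, "DC"),
    ("Bill", 789, "Chappaqua"),
]


def rows_to_xml(rows):
    """Build <persons> with one <row> per tuple, holding <name> and <ssn>."""
    persons = ET.Element("persons")
    for name, ssn, _address in rows:
        row = ET.SubElement(persons, "row")
        ET.SubElement(row, "name").text = name
        ET.SubElement(row, "ssn").text = str(ssn)
    return persons


xml_text = ET.tostring(rows_to_xml(ROWS), encoding="unicode")
print(xml_text)
```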

Semi-structured Data Explained
- List-valued attributes
  - XML is not 1NF!
- Impossible in (single, BCNF) tables:
    name    phone
    Bill    ???
    Hilary
  - Two phones!
- [XML example: persons Hilary and Bill, each with phone elements]

Object ids and References
- SSD graph might not be trees!
- But XML docs must be
  - Would cause much redundancy
- Soln: same concept as pointers in C/C++/Java
  - Object ids and references
- Graph example:
  - Movies: Lost in Translation, Hamlet
  - Stars: Bill Murray, Scarlett Johansson
- [XML example: movies Lost in Translation (2003) and Hamlet (1999), star Bill Murray]
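
A sketch of the id/reference idea, using the movies and stars from the graph example. The element and attribute names (`db`, `id`, `stars`) are assumptions made for this illustration, and the lookup is done by hand with Python's standard `ElementTree`, since without a DTD an `id` is just an ordinary attribute. Each star is stored once and referenced by id, which avoids repeating Bill Murray under both movies.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every object has an id; movies list the ids of their stars.
DOC = """
<db>
  <movie id="m1" stars="s1 s2"><title>Lost in Translation</title><year>2003</year></movie>
  <movie id="m2" stars="s1"><title>Hamlet</title><year>1999</year></movie>
  <star id="s1"><name>Bill Murray</name></star>
  <star id="s2"><name>Scarlett Johansson</name></star>
</db>
"""

root = ET.fromstring(DOC)

# Index every element that carries an id, much like a table of pointers.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def stars_of(movie_id):
    """Dereference each id listed in the movie's stars attribute."""
    movie = by_id[movie_id]
    return [by_id[ref].findtext("name") for ref in movie.get("stars").split()]


lost_cast = stars_of("m1")
hamlet_cast = stars_of("m2")
```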

What do we do with XML?
- Things done with XML:
  - Send to partners
  - Parse XML received
  - Convert to RDBMS rows
  - Query for particular data
  - Convert to other XML
  - Convert to formats other than XML
- Lots of tools/standards for these…
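
Three of these tasks (parse, query, convert to RDBMS rows) fit in one short sketch with Python's standard `ElementTree` and `sqlite3`. The document, its phone numbers and the `phones` table are invented for the example. Because a person may have several phones, the list-valued attribute is flattened into one row per (name, phone) pair.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up input: Bill has two phones, which a single flat row cannot hold.
DOC = """
<persons>
  <person><name>Hilary</name><phone>555-0100</phone></person>
  <person><name>Bill</name><phone>555-0101</phone><phone>555-0102</phone></person>
</persons>
"""

# Parse the XML received.
root = ET.fromstring(DOC)

# Query for particular data, using ElementTree's limited XPath support.
bill_phones = [p.text for p in root.findall("./person[name='Bill']/phone")]

# Convert to RDBMS rows: one row per phone number.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phones (name TEXT, phone TEXT)")
for person in root.findall("person"):
    name = person.findtext("name")
    for phone in person.findall("phone"):
        conn.execute("INSERT INTO phones VALUES (?, ?)", (name, phone.text))

rows = conn.execute(
    "SELECT name, phone FROM phones ORDER BY name, phone"
).fetchall()
```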

DTDs & understanding XML
- XML is extensible
- Advantage: when creating, we can use any tags we like
- Disadv: when reading, they can use any tags they like
  - Using XML docs a priori is very difficult
- Solution: impose some constraints

DTDs
- DTD: Document Type Definition
- You and partners/vertical industry/academic discipline decide on a DTD/schema for your docs
  - Specify which entities you may use/must understand
  - Specify legal relationships
- DTD specifies the grammar to be used
  - DTD = set of rules for creating valid entities
- DTD tells your software what to look for in doc

DTD examples
- Well-formed XML v. valid XML
- Simple example: [DTD listing not preserved]
- Partial publisher example rules:
  - Root -> publisher
  - Publisher -> name, book*, author*
  - Book -> title, date, author+
  - Author -> firstname, middlename?, lastname
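
The difference between well-formed and valid can be made concrete by checking a parsed document against the four publisher rules. The sketch below is a hand-rolled toy, not a real DTD validator: it assumes lowercase element names, treats every element without a rule as a text-only leaf, and encodes each rule as a regular expression over the sequence of child tag names (`*`, `+` and `?` mean the same thing in both notations).

```python
import re
import xml.etree.ElementTree as ET

# The slide's rules as regexes over a space-terminated list of child tags.
RULES = {
    "publisher": r"(name )(book )*(author )*",
    "book": r"(title )(date )(author )+",
    "author": r"(firstname )(middlename )?(lastname )",
}


def is_valid(element):
    """Check this element's children against its rule, then recurse."""
    rule = RULES.get(element.tag)
    if rule is None:
        # No rule: assume a leaf that may hold text but no child elements.
        return len(element) == 0
    children = "".join(child.tag + " " for child in element)
    if re.fullmatch(rule, children) is None:
        return False
    return all(is_valid(child) for child in element)


def validate(text):
    """Parse text and require a publisher root that satisfies the rules."""
    root = ET.fromstring(text)
    return root.tag == "publisher" and is_valid(root)


GOOD = (
    "<publisher><name>P</name><book><title>T</title><date>2005</date>"
    "<author><firstname>A</firstname><lastname>B</lastname></author>"
    "</book></publisher>"
)
# Well-formed, but the book has no author, so it breaks the book rule.
BAD = "<publisher><name>P</name><book><title>T</title><date>2005</date></book></publisher>"

good_ok = validate(GOOD)
bad_ok = validate(BAD)
```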

Partial DTD example (typos!)
    <!DOCTYPE PUBLISHER [ …
- DTD is not XML, but can be embedded in or ref.ed from XML
- Replacement for DTDs is XML Schema

XML Applications/dialects
- MathML: Mathematical Markup Language
  - …ations/ictp99/ictp99N8059.html
- VoiceXML:
  - …es/rps.xml
- ChemML: Chemical Markup Language
- XHTML: HTML retrofitted as an XML application

XML Applications/dialects
- MathML: Mathematical Markup Language
  - …99/ictp99N8059.html
- ChemML: Chemical Markup Language
- X4ML: XML for Merrill Lynch
- XHTML: HTML retrofitted as an XML application

XML Applications/dialects
- VoiceXML:
  - AT&T Directory Assistance

More XML Apps
- FIXML
  - XML equiv. of FIX: Financial Information eXchange
- swiftML
  - XML equiv. of SWIFT: Society for Worldwide Interbank Financial Telecommunications message format
- Apache's Ant
  - Scripting language for Java build management
- Many more…

More XML Applications/Protocols
- RSS: Rich Site Summary/Really Simple Syndication
  - News sites, blogs…
  - Screenshot
- [RSS example: channel "my channel" with item "story 1" … // other items]
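
Reading a feed comes down to ordinary XML parsing. The sketch below parses a tiny hand-written RSS 2.0 document with Python's standard `ElementTree`; the channel and story titles echo the slide, while the `example.com` links, the description and the second story are placeholders.

```python
import xml.etree.ElementTree as ET

# A minimal, made-up RSS 2.0 feed: one channel holding a list of items.
FEED = """
<rss version="2.0">
  <channel>
    <title>my channel</title>
    <link>http://example.com/</link>
    <description>placeholder description</description>
    <item><title>story 1</title><link>http://example.com/1</link></item>
    <item><title>story 2</title><link>http://example.com/2</link></item>
  </channel>
</rss>
"""

channel = ET.fromstring(FEED).find("channel")
channel_title = channel.findtext("title")

# Each <item> is one story; collect (title, link) pairs.
stories = [
    (item.findtext("title"), item.findtext("link"))
    for item in channel.findall("item")
]

for title, link in stories:
    print(title, link)
```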

More XML Applications/Protocols
- SOAP: Simple Object Access Protocol
  - XML-based messaging format
  - Used by Google API, Amazon API, Amazon light
  - Other examples: …10&topic=&topic_set=
- SOAP envelope with header and body
  - Request sales tax for total
      <SOAP:Envelope xmlns:SOAP="urn:schemas-xmlsoap-org:soap.v1"> … 100 …
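
A rough sketch of building such an envelope with Python's standard `ElementTree`. The `SOAP` prefix and namespace URI are taken from the slide; the request element names `GetSalesTax` and `SalesTotal` are hypothetical, since the original markup inside the envelope is not preserved. No message is sent here; the code only constructs and re-reads the XML.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "urn:schemas-xmlsoap-org:soap.v1"  # namespace URI shown on the slide
ET.register_namespace("SOAP", SOAP_NS)       # serialize with the SOAP: prefix


def sales_tax_request(total):
    """Build an envelope with an empty header and a body carrying the total."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    ET.SubElement(envelope, "{%s}Header" % SOAP_NS)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    request = ET.SubElement(body, "GetSalesTax")           # hypothetical name
    ET.SubElement(request, "SalesTotal").text = str(total)  # hypothetical name
    return ET.tostring(envelope, encoding="unicode")


message = sales_tax_request(100)
print(message)

# The receiver parses the envelope and pulls the total back out of the body.
received = ET.fromstring(message)
total = received.findtext("{%s}Body/GetSalesTax/SalesTotal" % SOAP_NS)
```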

More XML Applications/Protocols
- [SOAP request template; surviving parameter values: %(key)s, 0, 10, true, false]