Published by Lydia Allison. Modified over 8 years ago.

Web Servers, Data Transmission and Exchange
Zachary G. Ives
University of Pennsylvania
CIS 455 / 555 – Internet and Web Systems
January 29, 2008

2 Today
- Finish discussion of Web servers
- Communications: Sending data
  - Physical vs. logical representation
  - Encoding and management of heterogeneity

3 Authentication and Authorization
- Authentication
  - At minimum, user ID and password: authenticates the requestor
  - Client may wish to authenticate the server, too!
    - SSL (we’ll discuss this more later)
    - Part of SSL: certificate from trusted server, validating machine
    - Also: public key for encrypting client’s transmissions
- Authorization
  - Determine what the user can access
  - For files, applications: typically, an access control list
  - If data comes from a database, may also have view-based security

4 Programming Support in Web Servers
- CGI (Common Gateway Interface), the oldest:
  - A CGI is a separate program, often in Perl, invoked by the server
  - Certain info is passed from server to CGI via Unix-style environment variables
    - QUERY_STRING, REMOTE_HOST, CONTENT_TYPE, …
  - HTTP POST data is read from stdin
- Interface to a persistent process:
  - In essence, how communication with a database is done: Oracle or MySQL is running “on the side”
  - Communicate via pipes, APIs like ODBC/JDBC, etc.
- Server module running in the same process
  - Might be custom code (e.g., Apache extension) or an interpreter/runtime system…
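The CGI contract described above can be sketched in a few lines. This is only an illustration, not a production handler: the function name `handle_cgi_request` and the `name` form field are hypothetical, and `REQUEST_METHOD` and `CONTENT_LENGTH` are standard CGI variables that the slide does not list explicitly. The environment and stdin are passed in as arguments so the logic can be exercised without a web server.

```python
import io
from urllib.parse import parse_qs


def handle_cgi_request(environ, stdin):
    """Build a CGI response from environment variables and the request body."""
    # GET parameters arrive in the QUERY_STRING environment variable.
    params = parse_qs(environ.get("QUERY_STRING", ""))

    # POST data arrives on stdin; CONTENT_LENGTH says how much to read
    # (characters here, which equals bytes for ASCII form data).
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = stdin.read(length)
        for key, values in parse_qs(body).items():
            params.setdefault(key, []).extend(values)

    # "name" is a hypothetical form field used only for this example.
    name = params.get("name", ["world"])[0]

    # A CGI program writes headers, a blank line, then the body to stdout.
    return "Content-Type: text/plain\r\n\r\nHello, " + name + "!\n"


# Simulate a POST request without a web server.
demo = handle_cgi_request(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "8", "QUERY_STRING": ""},
    io.StringIO("name=Bob"),
)
print(demo)
```

When a server actually invokes such a script, the real `os.environ` and `sys.stdin` would take the place of the simulated arguments, and the returned text would be written to stdout.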

5 Server Modules
- Interpreters: JavaScript/JScript, PHP, ASP, …
  - Often a full-fledged programming language
  - Code is generally embedded within HTML, not stand-alone
- Custom runtimes/virtual machines: most modern Perl runtimes; Java servlets; ASP.NET
  - A virtual machine runs within the web server process
  - Functions are invoked within that VM (e.g., the JVM) to handle each request
  - Code is generally written as usual, but may need to use HTML to create UI rather than standard GUI APIs
- Most of these provide (at least limited) protection mechanisms

6 Servlets
- An interesting model for programming applications in Java
  - A servlet is a subclass of HttpServlet
  - It overrides methods doGet() or doPost()
  - It’s given a number of objects:
    - HttpServletRequest (includes info about parameters, browser, etc.)
    - HttpServletResponse (a means for sending info back to the browser, including data, forwarding requests, etc.)
  - There’s a notion of a session that can be used to share state across doGet()/doPost() invocations; it’s generally connected with a cookie
- Those of you who took CSE 330/CIS 550 should be generally familiar with servlets
- Those who didn’t should be able to catch up by looking at, e.g.,
  - http://www.apl.jhu.edu/~hall/java/Servlet-Tutorial/
  - http://www.novocode.com/doc/servlet-essentials/
- Your homework assignment will be to build a simple servlet engine a la Tomcat

7 (Cross-)Session State: Cookies
- Major problem with the sessionless nature of HTTP: how do we keep info between connections?
- Cookie: an opaque string associated with a web site, stored at the browser
  - Created in the HTTP response with “Set-Cookie: xxx”
  - Passed back in the HTTP request header as “Cookie: xxx”
  - Interpretation is up to the application
  - Usually, name-value pairs separated by semicolons; passed in the HTTP header:
    - Cookie: user=“Joe”; pwd=“blob” …
  - Often have an expiration
  - Very common: “session cookies”
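The Set-Cookie / Cookie round trip can be tried out with Python's standard `http.cookies` module. This is a minimal sketch: the cookie names and values (`user`, `session`, `abc123`) and the one-hour lifetime are made up for illustration, and no real HTTP connection is involved.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header that goes into the HTTP response.
outgoing = SimpleCookie()
outgoing["user"] = "Joe"
outgoing["user"]["max-age"] = 3600  # expiration: one hour (arbitrary choice)
outgoing["user"]["path"] = "/"
set_cookie_header = outgoing.output()
print(set_cookie_header)

# Later request: the browser sends the stored pairs back in a Cookie header.
# The server parses them; what they mean is entirely up to the application.
incoming = SimpleCookie()
incoming.load("user=Joe; session=abc123")
user = incoming["user"].value
session_id = incoming["session"].value
print(user, session_id)
```

A session cookie typically carries only an opaque identifier like `session` here, with the actual state kept on the server.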

8 Persistent State: Interfacing with a Database
- A very common operation:
  - Read some data from a database, output in a web form
  - e.g., postings on Slashdot, items for a product catalog, etc.
- Three problems, abstracted away by ODBC/ADO/JDBC:
  - Impedance mismatch from relational DBs to objects in Java (etc.)
  - Standard API for different databases
  - Physical implementation for each DB
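As a rough illustration of the "read from a database, output in a web page" pattern, the sketch below uses Python's DB-API with an in-memory SQLite database as a stand-in for ODBC/JDBC talking to Oracle or MySQL. The `catalog` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3
from html import escape

# An in-memory database stands in for a server running "on the side".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE catalog (id INTEGER PRIMARY KEY, item TEXT, price REAL)")
conn.executemany(
    "INSERT INTO catalog (item, price) VALUES (?, ?)",
    [("Widget", 9.99), ("Gadget <Pro>", 24.5)],
)


def render_catalog(connection):
    """Turn relational rows into an HTML table, escaping the text values."""
    rows = connection.execute("SELECT item, price FROM catalog ORDER BY id")
    lines = ["<table>"]
    for item, price in rows:
        lines.append(f"<tr><td>{escape(item)}</td><td>{price:.2f}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)


page = render_catalog(conn)
print(page)
```

The same query code would, in principle, run against a different database by swapping the connection, which is the "standard API" point; the row-to-markup loop is where the impedance mismatch shows up.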

9 Going One Step Further
- Today, data doesn’t just come from databases
  - Web services, e.g., Amazon or corporate intranet services
  - External entities like credit card companies, shippers
  - Web pages
  - Etc.

10 Sending Data
- How do we send data within a program?
  - What is the implicit model?
- How does this change when we need to make the data persistent?
- What happens when we are coupling systems?
  - How do we send data between programs on the same machine?
  - Between different machines?

11 Marshalling
- Converting from an in-memory data structure to something that can be sent elsewhere
  - Pointers -> something else
  - Specific byte orderings
  - Metadata
- Note that the same logical data gets a different physical encoding
  - A specific case of Codd’s idea of logical-physical separation
  - “Data model” vs. “data”
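The point that one logical value can have several physical encodings is easy to see with Python's `struct` module. In this sketch the record (a track number and a duration) is invented for illustration; only the byte-order handling matters.

```python
import struct

# Logical data: a (track number, duration in seconds) pair.
record = (7, 215.5)

# Two physical encodings of the same logical data:
# a 4-byte unsigned int followed by an 8-byte double.
big_endian = struct.pack(">Id", *record)     # "network" byte order
little_endian = struct.pack("<Id", *record)  # e.g., x86 byte order

print(big_endian.hex())
print(little_endian.hex())

# The bytes differ, but each decodes to the same logical record as long as
# the receiver knows which byte order the sender used.
decoded_big = struct.unpack(">Id", big_endian)
decoded_little = struct.unpack("<Id", little_endian)
print(decoded_big == decoded_little == record)
```

Decoding with the wrong byte order would not fail; it would silently yield different numbers, which is why the encoding has to be agreed on or carried as metadata.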

12 Communication and Streams
- When storing data to disk, we have a combination of sequential and random access
- When sending data on “the wire”, data is only sequential
  - “Stream-based communication” based on packets
- What are the implications here?
  - Pipelining, incremental evaluation, …

13 Why Data Interchange Is Hard
- Need to be able to understand:
  - Data encoding (physical data model)
    - May have syntactic heterogeneity
    - Endian-ness, marshalling issues
    - Impedance mismatches
  - Data representation (logical data model)
    - May have semantic heterogeneity
    - Imprecise and ambiguous values/descriptions

14 Examples
MP3 ID3 format: record at end of file

offset  length  description
0       3       "TAG" identifier string.
3       30      Song title string.
33      30      Artist string.
63      30      Album string.
93      4       Year string.
97      28      Comment string.
125     1       Zero byte separator.
126     1       Track byte.
127     1       Genre byte.
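A fixed-offset layout like this maps directly onto a `struct` format string. The sketch below follows the offsets in the table; the function names and the choice of Latin-1 for decoding text fields are assumptions made for the example, not something the slide specifies.

```python
import struct

# Field widths from the table: 3 + 30 + 30 + 30 + 4 + 28 + 1 + 1 + 1 = 128 bytes.
# "<" selects standard sizes with no alignment padding.
ID3V1 = struct.Struct("<3s30s30s30s4s28sBBB")


def _text(raw):
    """Decode a fixed-width field, dropping NUL and space padding."""
    # Latin-1 is an assumption; the table does not name a character set.
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def parse_id3v1(data):
    """Return the tag fields from the last 128 bytes of a file, or None."""
    if len(data) < ID3V1.size:
        return None
    fields = ID3V1.unpack(data[-ID3V1.size:])
    tag, title, artist, album, year, comment, _zero, track, genre = fields
    if tag != b"TAG":
        return None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "track": track,
        "genre": genre,
    }


# Fake "file": some audio bytes followed by a tag record.
fake_mp3 = b"\xff" * 10 + ID3V1.pack(
    b"TAG", b"Song", b"Artist", b"Album", b"1999", b"hi", 0, 5, 17
)
print(parse_id3v1(fake_mp3))
```

Nothing in the bytes themselves says where a field starts or what it means, so the reader has to know the table in advance.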

15 Examples
JPEG “JFIF” header:
- Start of Image (SOI) marker -- two bytes (FFD8)
- JFIF marker (FFE0)
  - length -- two bytes
  - identifier -- five bytes: 4A, 46, 49, 46, 00 (the ASCII code equivalent of a zero-terminated "JFIF" string)
  - version -- two bytes: often 01, 02
    - the most significant byte is used for major revisions
    - the least significant byte for minor revisions
  - units -- one byte: units for the X and Y densities
    - 0 => no units, X and Y specify the pixel aspect ratio
    - 1 => X and Y are dots per inch
    - 2 => X and Y are dots per cm
  - X density -- two bytes
  - Y density -- two bytes
  - X thumbnail -- one byte: 0 = no thumbnail
  - Y thumbnail -- one byte: 0 = no thumbnail
  - (RGB)n -- 3n bytes: packed (24-bit) RGB values for the thumbnail pixels, n = X thumbnail * Y thumbnail
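The header fields listed above can be read the same way as any fixed binary layout. This sketch assumes the multi-byte fields are big-endian, as JPEG uses, and stops before the optional thumbnail data; the function name and the sample bytes are made up for illustration.

```python
import struct

# SOI (2) + marker (2) + length (2) + identifier (5) + version (1+1)
# + units (1) + X density (2) + Y density (2) + X/Y thumbnail (1+1) = 20 bytes.
JFIF_HEADER = struct.Struct(">2s2sH5sBBBHHBB")
UNITS = {0: "aspect ratio", 1: "dots per inch", 2: "dots per cm"}


def parse_jfif_header(data):
    """Decode the fixed part of a JFIF header; raise ValueError if it is not one."""
    if len(data) < JFIF_HEADER.size:
        raise ValueError("too short for a JFIF header")
    (soi, marker, length, ident, major, minor,
     units, x_density, y_density, x_thumb, y_thumb) = JFIF_HEADER.unpack(
        data[:JFIF_HEADER.size]
    )
    if soi != b"\xff\xd8" or marker != b"\xff\xe0" or ident != b"JFIF\x00":
        raise ValueError("not a JFIF file")
    return {
        "length": length,
        "version": (major, minor),
        "units": UNITS.get(units, "unknown"),
        "density": (x_density, y_density),
        "thumbnail": (x_thumb, y_thumb),
    }


# Hand-built sample header: version 01.02, 72 x 72 dots per inch, no thumbnail.
sample = bytes.fromhex("ffd8ffe000104a46494600010201004800480000")
print(parse_jfif_header(sample))
```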

16 Finding File Formats
- http://www.wikipedia.org/
- http://www.wotsit.org/
- etc.

17 The Problem
- You need to look into a manual to find file formats
  - (At best, e.g., MS .DOC file format)
- The Web is about making data exchange easier…
  - Maybe we can do better!
  - “The mother of all file formats”

18 Desiderata for Data Interchange
- Ability to represent many kinds of information
  - Different data structures
- Hardware-independent encoding
  - Endian-ness, UTF vs. ASCII vs. EBCDIC
- Standard tools and interfaces
- Ability to define “shape” of expected data
  - With forwards- and backwards-compatibility!
- That’s XML…

19 Consumers of XML
- A myriad of tools and interfaces, including:
  - DOM (document object model)
    - Standard OO representation of an XML tree
  - SAX (simple API for XML)
    - An event-driven parser interface for XML
    - startElement, endElement, etc.
  - Ant: Java-based “make” tool with XML “makefile”
  - XPath, XQuery, XSL, XSLT
  - Web service standards
  - Anything AJAX (“mash-ups”)
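The difference between the DOM and SAX styles shows up clearly when both read the same small document. The sketch below uses Python's standard `xml.dom.minidom` and `xml.sax` modules; the sample document and the `EventLogger` class name are made up for the example.

```python
import xml.dom.minidom
import xml.sax

DOC = (
    "<dblp><article><title>The 1995 SQL Reunion</title>"
    "<year>1997</year></article></dblp>"
)

# DOM: parse the whole document into a tree, then navigate it.
dom = xml.dom.minidom.parseString(DOC)
title_node = dom.getElementsByTagName("title")[0]
dom_title = title_node.firstChild.data
print(dom_title)


# SAX: the parser calls back as it streams through the input.
class EventLogger(xml.sax.ContentHandler):
    """Record start/end/text events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Text may arrive in more than one chunk.
        self.events.append(("text", content))


logger = EventLogger()
xml.sax.parseString(DOC.encode("utf-8"), logger)
tag_events = [event for event in logger.events if event[0] != "text"]
print(tag_events)
```

DOM needs the whole tree in memory but allows random access; SAX fits sequential, stream-based input because it never has to look back.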

20 XML as a Data Model
- XML “information set” includes 7 types of nodes:
  - Document (root)
  - Element
  - Attribute
  - Processing instruction
  - Text (content)
  - Namespace
  - Comment
- XML data model includes this, plus typing info, plus order info and a few other things
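
21 Example XML Document
<?xml version="1.0"?>
<dblp>
  <mastersthesis mdate="2002…" key="ms/Brown92">
    <author>Kurt P. Brown</author>
    <title>PRPL: A Database Workload Specification Language</title>
    <year>1992</year>
    <school>Univ. of Wisconsin-Madison</school>
  </mastersthesis>
  <article mdate="2002…" key="tr/dec/…">
    <editor>Paul R. McJones</editor>
    <title>The 1995 SQL Reunion</title>
    <journal>Digital System Research Center Report</journal>
    <volume>SRC1997-018</volume>
    <year>1997</year>
    <ee>db/labs/dec/SRC1997-018.html</ee>
    <ee>http://www.mcjones.org/System_R/SQL_Reunion_95/</ee>
  </article>
</dblp>
Callouts: processing instruction, element, attribute, open-tag, close-tag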

22 XML Data Model Visualized (~ Document Object Model)
[Figure: tree with a Root node whose children are the ?xml processing instruction and the dblp element. dblp has two element children, mastersthesis and article, each with mdate and key attributes. mastersthesis contains author, title, year and school; article contains editor, title, year, journal, volume and ee. Each of these holds a text node. Legend: attribute, root, p-i, element, text.]

23 A Few Common Uses of XML
- Serves as an extensible HTML
  - Allows custom tags (e.g., used by MS Word, OpenOffice)
  - Supplement it with stylesheets (XSL) to define formatting
- Provides an exchange format for data (still need to agree on terminology)
  - Tables, objects, etc.
- Format for marshalling and unmarshalling data in Web Services

24 XML as a Super-HTML (MS Word)
CIS 550: Database and Information Systems
Fall 2003
311 Towne, Tuesday/Thursday 1:30PM – 3:00PM

25 XML Easily Encodes Relations
Student-course-grade

id   course    grade
1    330-f03   B
23   455-s04   A

XML encoding of the same tuples: (1, 330-f03, B), (23, 455-s04, A)
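One possible way to encode this relation as XML is one element per tuple with one child element per column. The sketch below uses Python's `xml.etree.ElementTree`; the element name `tuple` and the lowercase root name are assumptions, since the slide's original markup is not preserved.

```python
import xml.etree.ElementTree as ET

COLUMNS = ("id", "course", "grade")
ROWS = [("1", "330-f03", "B"), ("23", "455-s04", "A")]


def relation_to_xml(name, columns, rows):
    """Encode a relation as <name><tuple><col>value</col>...</tuple>...</name>."""
    root = ET.Element(name)
    for row in rows:
        tup = ET.SubElement(root, "tuple")  # "tuple" is a hypothetical tag name
        for column, value in zip(columns, row):
            ET.SubElement(tup, column).text = value
    return root


def xml_to_relation(root):
    """Decode the XML back into a list of row tuples."""
    return [tuple(child.text for child in tup) for tup in root]


root = relation_to_xml("student-course-grade", COLUMNS, ROWS)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
print(xml_to_relation(ET.fromstring(xml_text)))
```

The column names travel with every value, which makes the data self-describing and is also where the verbosity comes from.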

26 It Also Encodes Objects (with Pointers Represented as IDs)
Programming Joan Jill www…. …

27 XML and Code
- Web Services (.NET, Java web service toolkits) are using XML to pass parameters and make function calls: marshalling as part of remote procedure calls
  - SOAP + WSDL
- Why?
  - Easy to be forwards-compatible
  - Easy to read over and validate (?)
  - Generally firewall-compatible
- Drawbacks?
  - XML is a verbose and inefficient encoding!
  - But if the calls are only sending a few hundred bytes, who cares?
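To get a feel for XML-based marshalling of a procedure call, the sketch below uses XML-RPC from Python's standard library. Note the substitution: XML-RPC is a simpler relative of SOAP, not SOAP itself, and it is used here only because it ships with Python. The method name `add` and its arguments are hypothetical.

```python
import struct
import xmlrpc.client

# Marshal a call to a hypothetical remote method add(2, 3) as XML.
payload = xmlrpc.client.dumps((2, 3), methodname="add")
print(payload)

# The receiver unmarshals the same text back into parameters and a method name.
params, method = xmlrpc.client.loads(payload)
print(method, params)

# For comparison: the two integers alone in a compact binary encoding.
raw = struct.pack(">ii", 2, 3)
print(len(payload), "characters of XML vs.", len(raw), "bytes of binary")
```

The XML form is many times larger than the raw integers, but it is readable, self-describing, and travels over plain HTTP.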

28 XML When Tags Are Used by Different Sources
- Namespaces allow us to specify a context for different tags
- Two parts:
  - Binding of namespace to URI
  - Qualified names
- http://www.fictitious.com/mypath
  - is in default namespace
  - this a different tag
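The two parts, binding a prefix to a URI and then using qualified names, can be seen in a small document parsed with `xml.etree.ElementTree`. Only the URI http://www.fictitious.com/mypath comes from the slide; the element names, the `myns` prefix, and the default-namespace URI are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """
<tag xmlns="http://www.example.org/default"
     xmlns:myns="http://www.fictitious.com/mypath">
  <thistag>is in default namespace</thistag>
  <myns:thistag>is a different tag</myns:thistag>
</tag>
"""

root = ET.fromstring(DOC.strip())

# ElementTree expands each name to {namespace-URI}local-name, so the two
# "thistag" elements are distinct even though their local names match.
qualified = [child.tag for child in root]
print(qualified)

# Lookups bind a prefix to a URI; the prefix itself is arbitrary.
ns = {"d": "http://www.example.org/default",
      "myns": "http://www.fictitious.com/mypath"}
default_text = root.find("d:thistag", ns).text
other_text = root.find("myns:thistag", ns).text
print(default_text, "|", other_text)
```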

29 XML Isn’t Enough on Its Own
- It’s too unconstrained for many cases!
  - How will we know when we’re getting garbage?
  - How will we query?
  - How will we understand what we got?