Web Technology and DBMS’s (CB 29) CPSC 356 Database Design Ellen Walker Hiram College (Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2005)
What is the Internet? Interconnected networks, interoperable via standards (TCP, IP). History: ARPANET (late 60’s, early 70’s) - military; NSF took over, provided funding (1986); commercial backbones (1996 - present).
Internet vs. Intranet. Internet: global, open, public. Intranet: private network for exclusive use of an organization; behind a firewall; typically, limited interaction. Extranet: an intranet that is partially accessible to authorized outsiders.
Internet Evolution of a Business (Cisco): E-mail; Web site; E-Commerce (customers can place and pay for orders via web); E-Business (complete integration of Internet technology into business’s infrastructure); Ecosystem (entire business process automated - customers, suppliers, alliance partners, corporate infrastructure merged into seamless system).
Web vs. Internet Internet provides infrastructure of information transmission across the network Web contains interlinked information (web pages) Servers: contain pages and transmit on request Clients (browsers): request and display pages to users
Languages of the Web. HTTP - protocol for sending / receiving web pages; includes the GET and POST request methods. HTML - basic markup language (represents complex pages as text); includes a mechanism (href) to reference other pages. XML - Extensible Markup Language - lets authors define their own tags, so it can represent more complex document information than HTML. (etc.) -- this is not a complete list!
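To make the href mechanism concrete, here is a minimal sketch that pulls the link targets out of a small HTML page using Python's standard `html.parser` module. The page text and the `LinkCollector` class name are made up for illustration.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

# Hypothetical page: two anchors, one relative and one absolute.
page = (
    "<html><body>"
    '<a href="syllabus.html">Syllabus</a> '
    '<a href="http://example.com/">Example</a>'
    "</body></html>"
)

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

Each collected string is a reference to another page; a browser follows it by issuing a new HTTP request.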
Limitations of HTTP. Each transaction consists of a single request for data and a single response. GET = request data; POST = send data to the server (e.g. data from a form). Protocol is “stateless” - does not retain information between transactions. Workarounds: cookies or hidden form fields.
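As a rough illustration of the two request methods, the sketch below builds (but never sends) a GET and a POST request with Python's standard `urllib`. The URL and the `title` parameter are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint, used only to build (not send) the requests.
BASE_URL = "http://example.com/search.php"

def build_get(params):
    """GET: parameters travel in the URL's query string."""
    return Request(BASE_URL + "?" + urlencode(params), method="GET")

def build_post(params):
    """POST: parameters (e.g. form fields) travel in the request body."""
    body = urlencode(params).encode("ascii")
    return Request(BASE_URL, data=body, method="POST")

get_req = build_get({"title": "databases"})
post_req = build_post({"title": "databases"})

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Neither request carries any memory of an earlier one; anything the server should remember has to be sent again each time (for example in a cookie or a hidden form field).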
Static vs. Dynamic Web Page. Static web page: contents of page stored in a file, served as originally written. Dynamic web page: generated as needed; only instructions for generating the page are stored (e.g. a PHP script that queries and displays based on an argument).
Dynamic Web Pages Input provided as parameters or from forms in the page Data generated from queries Pages written in a “scripting language” (PHP is one example)
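A minimal sketch of a dynamic page, written in Python rather than PHP: the HTML is produced on each request from a query whose argument comes from the caller. The `course` table, its rows, and the function names are invented for the example; an in-memory SQLite database stands in for the DBMS.

```python
import sqlite3
from html import escape

def make_demo_db():
    """Create a throwaway in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE course (code TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO course VALUES (?, ?)",
        [("DB101", "Database Design"), ("NET201", "Networks")],
    )
    return conn

def render_course_page(conn, code):
    """Generate the page for one request; only this code is stored, not the page."""
    rows = conn.execute(
        "SELECT code, title FROM course WHERE code = ?", (code,)
    ).fetchall()
    # Escape values so data from the database cannot inject markup.
    items = "".join(
        f"<li>{escape(c)}: {escape(t)}</li>" for c, t in rows
    ) or "<li>No matching course</li>"
    return f"<html><body><ul>{items}</ul></body></html>"

conn = make_demo_db()
print(render_course_page(conn, "DB101"))
```

Changing the data in the table changes what every client sees, without editing any stored page.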
Requirements for Web-DBMS Integration Ability to maintain data security Not tied to a single DBMS Not tied to a single browser Availability of all features of DBMS Open-architecture approach, supporting multiple servers, standard interoperability languages (e.g. DCOM, CORBA, XML, SOAP)
Advantages of Web Approach Simplicity Platform independence Graphical User Interface Standardization Transparent network access Simplified deployment Global access
Disadvantages (lack of) Reliability (lack of) Security Cost (!) Heavy (unpredictable) loads HTML limitations Bandwidth & Performance limitations Immaturity of platform & tools
Tools to Integrate the Web Scripting Languages Client-side or Server-side Common Gateway Interface (CGI) Server-side Web server extensions Java, J2EE, SQLJ, etc. Microsoft .Net Oracle’s Internet platform
Scripting languages Programs embedded in HTML files on server Behavior of all clients changed by changing script on server Scripts are generally interpreted, not compiled
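JavaScript. Server-side or client-side (client-side more common). Allows immediate reaction to user interface widgets. Object-based (works with the page through the Document Object Model). Limited relative to Java and, despite the name, not derived from it (the analogy VBScript : Visual Basic = JavaScript : Java is only approximate).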
Perl, PHP Server side languages Heavy Unix flavor (e.g. use of regular expressions) PHP preferred for Linux Integrates with Apache HTTP Server Works well with MySQL or PostgreSQL
Common Gateway Interface (CGI) Defines how scripts communicate with web servers Uses Environment variables, standard I/O on server side Uses HTML Forms on client side (original impetus for forms) Nearly any programming language can run on server (std. output returned to browser) Server runs script based on filename (x.cgi, x.php, etc) - this is configurable
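The CGI contract can be sketched in a few lines of Python: input arrives in environment variables such as `QUERY_STRING`, and whatever the script writes to standard output (headers, a blank line, then the body) is returned to the browser. The `name` form field and the function names are assumptions for illustration; the demo call at the end simulates the server by passing a dictionary instead of the real environment.

```python
import io
import os
import sys
from html import escape
from urllib.parse import parse_qs

def handle_request(environ, out):
    """Read CGI input from environment variables; write the response to out."""
    # For a GET request the web server places the form data in QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]  # "name" is a hypothetical field

    # A CGI response is a header block, a blank line, then the body.
    out.write("Content-Type: text/html\n\n")
    out.write(f"<html><body><p>Hello, {escape(name)}!</p></body></html>\n")

def main():
    """Entry point when the web server actually runs the script."""
    handle_request(os.environ, sys.stdout)

# Simulated request, so the sketch can run without a web server.
demo_out = io.StringIO()
handle_request({"QUERY_STRING": "name=Ada"}, demo_out)
print(demo_out.getvalue())
```

Because the interface is only environment variables and standard I/O, the same script could be rewritten in nearly any language without changing the server.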
CGI Environment
Advantages of CGI Standardized interface Programming-language independent Web-server independence Simplicity Wide acceptance
Disadvantages of CGI. Web server can be bottleneck. Problems from HTTP statelessness: no validation of user input (must go “back” to repopulate form); no transactions (must connect to database on every page). Security risks: scripts with greater privileges than server; risk of unanticipated (or malicious) user input.
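One common form of malicious input is text crafted to change the SQL a script builds. The sketch below, using an in-memory SQLite database with an invented `account` table, contrasts pasting input into the query text with binding it as a parameter. It illustrates the risk in general; it is not specific to CGI.

```python
import sqlite3

# Hypothetical table and rows, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (owner TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("alice", 100), ("bob", 250)],
)

def find_unsafe(owner):
    """Vulnerable: user input is pasted into the SQL text."""
    sql = "SELECT owner, balance FROM account WHERE owner = '" + owner + "'"
    return conn.execute(sql).fetchall()

def find_safe(owner):
    """Safer: the input is bound as a parameter and never parsed as SQL."""
    return conn.execute(
        "SELECT owner, balance FROM account WHERE owner = ?", (owner,)
    ).fetchall()

# Input that closes the quote and adds a condition that is always true.
malicious = "x' OR '1'='1"

print(len(find_unsafe(malicious)))  # every row matches
print(len(find_safe(malicious)))    # no owner has this literal name
```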
HTTP Cookies. Stored by browser in a file on client. Set by the server in an HTTP response (Set-Cookie header); returned by the browser in later HTTP requests (Cookie header). Issues: not supported by all browsers; for privacy reasons, some users disable cookies.
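A small sketch of the cookie round trip using Python's standard `http.cookies` module. The cookie name `session_id` and its value are hypothetical, and the browser's side is simulated with a plain string.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header sent along with a response.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"      # hypothetical name and value
outgoing["session_id"]["path"] = "/"
set_cookie_header = outgoing.output(header="Set-Cookie:")
print(set_cookie_header)

# Browser side (simulated): later requests carry the value in a Cookie header.
cookie_header = "session_id=abc123"

# Server side again: parse the Cookie header to recover the saved state.
incoming = SimpleCookie()
incoming.load(cookie_header)
session_id = incoming["session_id"].value
print(session_id)
```

The server itself stays stateless; the state lives in the browser and is presented again with each request.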
Non-CGI Gateways API Extensions to server (NSAPI, ISAPI) Scripts loaded as part of server Direct interaction with back end One copy per server, not per request Changes server executable (risk?) Requires specialized programmers
Vendor Specific Scripting Developed by vendors as server add-ons JSP - Java Server pages (Sun) ASP - Active Server pages (Microsoft) Flexibility of CGI, without performance overhead and I/O limitations Better integration with vendor’s other web solutions Some compatibility limitations
Sun’s Java Platform Java language: “write once run anywhere” Assorted “editions” ME (embedded) SE (standard) - for desktop & workstation EE (enterprise) - for multi-tier, robust, multiuser applications
Java 2 Platform (1999)
Microsoft .NET Platform. Integrated across servers, OS, applications, programming language environments. OLE - Object Linking and Embedding: shared objects that provide specific functionality (e.g. sound clip). [D]COM ([Distributed] Component Object Model): connects client application and object (binary-compatible components).
MS OLE - DB
.NET Framework
Oracle’s Approach Aimed at distributed environments N-tier architecture, based on industry standards Designed for e-business Components include: TopLink (relate Java objects to relational databases) Portal (entry point for web access) Business Intelligence (Reports, Discoverer, etc)
Oracle Internet Platform