COMP 150-IDS: Internet Scale Distributed Systems (Spring 2017)

Slides:



Advertisements
Similar presentations
REST Vs. SOAP.
Advertisements

HTTP Hypertext Transfer Protocol. HTTP messages HTTP is the language that web clients and web servers use to talk to each other –HTTP is largely “under.
Chapter 2 Application Layer Computer Networking: A Top Down Approach Featuring the Internet, 3 rd edition. Jim Kurose, Keith Ross Addison-Wesley, July.
HTTP Overview Vijayan Sugumaran School of Business Administration Oakland University.
1 The World Wide Web. 2  Web Fundamentals  Pages are defined by the Hypertext Markup Language (HTML) and contain text, graphics, audio, video and software.
2/9/2004 Web and HTTP February 9, /9/2004 Assignments Due – Reading and Warmup Work on Message of the Day.
HTTP By Mychal Hess, Dee Chow, and Riley Barnes. History HTTP  Tim Berners-Lee he implemented the HTTP protocol in 1990 at the European Center for High-
HTTP & REST Noah Mendelsohn Tufts University Web: COMP 150-IDS: Internet Scale.
Web Architecture Dr. Frank McCown Intro to Web Science Harding University This work is licensed under a Creative Commons Attribution-NonCommercial- ShareAlike.
CP476 Internet Computing Lecture 5 : HTTP, WWW and URL 1 Lecture 5. WWW, HTTP and URL Objective: to review the concepts of WWW to understand how HTTP works.
2: Application Layer1 CS 4244: Internet Software Development Dr. Eli Tilevich.
Wyatt Pearsall November  HyperText Transfer Protocol.
Copyright 2012 & 2015 – Noah Mendelsohn Introduction to: The Architecture of the World Wide Web Noah Mendelsohn Tufts University
1 Seminar on Service Oriented Architecture Principles of REST.
World Wide Web “WWW”, "Web" or "W3". World Wide Web “WWW”, "Web" or "W3"
2007cs Servers on the Web. The World-Wide Web 2007 cs CSS JS HTML Server Browser JS CSS HTML Transfer of resources using HTTP.
IS-907 Java EE World Wide Web - Overview. World Wide Web - History Tim Berners-Lee, CERN, 1990 Enable researchers to share information: Remote Access.
Web Technologies Lecture 1 The Internet and HTTP.
Internet Applications (Cont’d) Basic Internet Applications – World Wide Web (WWW) Browser Architecture Static Documents Dynamic Documents Active Documents.
Overview of Servlets and JSP
COMP2322 Lab 2 HTTP Steven Lee Jan. 29, HTTP Hypertext Transfer Protocol Web’s application layer protocol Client/server model – Client (browser):
27.1 Chapter 27 WWW and HTTP Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
1 Chapter 22 World Wide Web (HTTP) Chapter 22 World Wide Web (HTTP) Mi-Jung Choi Dept. of Computer Science and Engineering
Simple Web Services. Internet Basics The Internet is based on a communication protocol named TCP (Transmission Control Protocol) TCP allows programs running.
REST API Design. Application API API = Application Programming Interface APIs expose functionality of an application or service that exists independently.
Intro to REST Joe Gregorio Google. REST is an Architectural Style.
What’s Really Happening
National College of Science & Information Technology.
From Web Documents to Web Applications
Web fundamentals: Clients, Servers, and Communication
Block 5: An application layer protocol: HTTP
How HTTP Works Made by Manish Kushwaha.
WWW and HTTP King Fahd University of Petroleum & Minerals
HTTP – An overview.
The Hypertext Transfer Protocol
JavaScript and Ajax (Internet Background)
Web Development Web Servers.
How does it work ?.
Introduction to: The Architecture of the World Wide Web
REST- Representational State Transfer Enn Õunapuu
CNIT 131 Internet Basics & Beginning HTML
COMP2322 Lab 2 HTTP Steven Lee Feb. 8, 2017.
Advanced Web-based Systems | Misbhauddin
Hypertext Transport Protocol
Internet transport protocols services
Introduction Web Environments
Representational State Transfer
Introduction to: The Architecture of the World Wide Web
WEB API.
What is Cookie? Cookie is small information stored in text file on user’s hard drive by web server. This information is later used by web browser to retrieve.
Chapter 27 WWW and HTTP.
Hypertext Transfer Protocol
Versioning, Extensibility & Postel’s Law
Introduction to: The Architecture of the World Wide Web
Hypertext Transfer Protocol
From Web Documents to Web Applications
COMP 150-IDS: Internet Scale Distributed Systems (Spring 2016)
$, $$, $$$ API testing Edition
EE 122: HyperText Transfer Protocol (HTTP)
REST APIs Maxwell Furman Department of MIS Fox School of Business
Hypertext Transfer Protocol
Kevin Harville Source: Webmaster in a Nutshell, O'Rielly Books
Introduction to World Wide Web
CIS 133 mashup Javascript, jQuery and XML
WEB SERVICES From Chapter 19, Distributed Systems
Traditional Internet Applications
HTTP Hypertext Transfer Protocol
Hypertext Transfer Protocol
CSCI-351 Data communication and Networks
Chengyu Sun California State University, Los Angeles
Presentation transcript:

COMP 150-IDS: Internet Scale Distributed Systems (Spring 2017) HTTP & REST Noah Mendelsohn Tufts University Email: noah@cs.tufts.edu Web: http://www.cs.tufts.edu/~noah Copyright 2012, 2103, 2015 & 2017

Architecting a universal Web Identification: URIs Interaction: HTTP Data formats: HTML, JPEG, GIF, etc.

Goals Learn why traditional architectures would not have worked for the Web Learn the basics of HTTP (review) Explore some interesting characteristics of HTTP Learn the REST distributed computing architecture

Question: What would happen if we built the Web from RPC?

Why not use RPC for the Web? No uniform interface You’d need an interface definition to follow a link No global naming Most RPC data types poorly tuned for documents Int, float, struct not HTML, XML, mixed-content, etc. We’ll revisit the RPC/Web question later and add some more

HTTP in Action (review from first week)

What Happens When We Browse a Web Page?

The user clicks on a link URI is http://webarch.noahdemo.com/demo1/test.html URI is http://webarch.noahdemo.com/demo1/test.html

The http “scheme” tells client to send HTTP GET msg URI is http://webarch.noahdemo.com/demo1/test.html URI is http://webarch.noahdemo.com/demo1/test.html HTTP GET

The server is identified by DNS name in the URI URI is http://webarch.noahdemo.com/demo1/test.html URI is http://webarch.noahdemo.com/demo1/test.html HTTP GET Host: webarch.noahdemo.com

The client sends an HTTP GET URI is http://webarch.noahdemo.com/demo1/test.html HTTP GET demo1/test.html GET /demo1/test.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us Host: webarch.noahdemo.com

The client sends an HTTP GET URI is http://webarch.noahdemo.com/demo1/test.html HTTP GET …but it can carry binary entity bodies such as image/jpeg Note that HTTP is a text-based protocol demo1/test.html GET /demo1/test.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us Host: webarch.noahdemo.com

The server sends an HTTP Response HTTP/1.1 200 OK Date: Tue, 28 Aug 2007 01:49:33 GMT Server: Apache Transfer-Encoding: chunked Content-Type: text/html <html> <head> <title>Demo #1</title> </head> <body> <h1>A very simple Web page</h1> </body> </html> The server sends an HTTP Response HTTP GET HTTP Status Code 200 Means Success! demo1/test.html Host: webarch.noahdemo.com HTTP RESPONSE

The server sends an HTTP Response HTTP/1.1 200 OK Date: Tue, 28 Aug 2007 01:49:33 GMT Server: Apache Transfer-Encoding: chunked Content-Type: text/html <html> <head> <title>Demo #1</title> </head> <body> <h1>A very simple Web page</h1> </body> </html> The server sends an HTTP Response HTTP GET demo1/test.html The “representation” returned is an HTML document Host: webarch.noahdemo.com HTTP RESPONSE

The server sends an HTTP Response HTTP/1.1 200 OK Date: Tue, 28 Aug 2007 01:49:33 GMT Server: Apache Transfer-Encoding: chunked Content-Type: text/html <html> <head> <title>Demo #1</title> </head> <body> <h1>A very simple Web page</h1> </body> </html> The server sends an HTTP Response HTTP GET demo1/test.html Host: webarch.noahdemo.com The HTML for the page. HTTP RESPONSE

HTTP Methods, Headers & Extensibility

HTTP Requests always have a method URI is http://webarch.noahdemo.com/demo1/test.html HTTP REQUEST Host: webarch.noahdemo.com GET GET /demo1.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us

HTTP Headers GET /demo1.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us Request headers HTTP/1.1 200 OK Date: Tue, 28 Aug 2007 01:49:33 GMT Server: Apache Transfer-Encoding: chunked Content-Type: text/html <!DOCTYPE html> <html> <head> <title>Demo #1</title> </head> <body> <h1>A very simple Web page</h1> </body> </html> Response headers

HTTP Extension Points HTTP Technologies used by HTTP New headers (quite common) New methods (rare) New versions of HTTP itself (e.g. HTTP 1.1) (very rare) New status codes (very rare) Technologies used by HTTP Media types (text/html, image/jpec, application/…something new?... Languages (e.g. en-us) Transfer encodings (e.g. gzip)

HTTP Protocol for a Discoverable Web

HTTP: the protocol for a discoverable Web HTTP provides a uniform interface to all resources You don’t have to know in advance what a resource is like to access it with HTTP…so, you or your software can explore the Web in ad-hoc ways Self-describing: when a response comes back, you can figure out what it means Using HTTP to access a resource doesn’t damage the resource For all this to work, you must use HTTP…and you must use it properly!

HTTP Requests always have a method URI is http://webarch.noahdemo.com/demo1/test.html HTTP REQUEST Host: webarch.noahdemo.com GET GET /demo1.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us

Method Meaning GET POST HEAD PUT DELETE HTTP Methods GET is very carefully defined to make the Web scalable and discoverable. HTTP Methods Method Meaning GET Retrieve a representation of the resource but do not change anything! POST Update the resource or a dependent resource base on supplied representation HEAD Optimized GET – retrieves only headers, used for checking caches, etc. PUT Create or completely replace the resource base on supplied representation DELETE Delete the resource identified by the URI

Method Meaning GET POST HEAD PUT DELETE HTTP Methods GET is very carefully defined to make the Web scalable and discoverable. HTTP Methods Method Meaning GET Retrieve a representation of the resource but do not change anything! POST Update the resource or a dependent resource base on supplied representation HEAD Optimized GET – retrieves only headers, used for checking caches, etc. PUT Create or completely replace the resource base on supplied representation DELETE Delete the resource identified by the URI

HTTP GET Has A Very Particular Meaning URI is http://webarch.noahdemo.com/demo1/test.html HTTP GET It’s always safe to try a GET on any URI. In fact, that’s why search engines like Google can safely crawl the Web! GET /demo1.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us Host: webarch.noahdemo.com GET

HTTP GET Has A Very Particular Meaning It’s always safe to try a GET on any URI. In fact, that’s why search engines like Google can safely crawl the Web! But GET is so convenient that some people use it in the wrong places…if you’re changing the state of a resource you must use PUT or POST!

Demonstration #3: http://webarch.noahdemo.com/MisuseGet/

Using HTTP Properly If GET is used instead of POST, your applications may update accidentally If too many applications misuse GET, then it will become impossible to explore the web If you click a link, you might break something..you would have to ask about each link before clicking! Search crawlers couldn’t be used…there would be no Google! Read TAG Finding “URIs, Addressability, and the use of HTTP GET and POST” http://www.w3.org/2001/tag/doc/whenToUseGet.html

Web Interaction: Highlights HTTP: a uniform way to interact with resources Promotes interoperability & discoverablility Use HTTP properly: GET is special! HTTP has features like content negotiation and headers that let you adapt content for your users

The Self-Describing Web

The Self-Describing Web Everything you need to understand a response is in the response… …to find the instructions, read RFC 3989… …and follow your nose!

“The Self-Describing Web” Follow your nose The Web community’s term for: Going step by step through a Web response and the pertinent specifications… …to figure out what the response means All the specifications you need are found (transitively) from RFC 3986! Thus there is a quite rigorous sense in which the Web is browsable and explorable Read TAG Finding “The Self-Describing Web” http://www.w3.org/2001/tag/doc/selfDescribingDocuments

The server sends an HTTP Response HTTP/1.1 200 OK Date: Tue, 28 Aug 2007 01:49:33 GMT Server: Apache Transfer-Encoding: chunked Content-Type: text/html <html> <head> <title>Demo #1</title> </head> <body> <h1>A very simple Web page</h1> </body> </html> The server sends an HTTP Response HTTP GET demo1/test.html The “representation” returned is an HTML document Host: webarch.noahdemo.com HTTP RESPONSE

Stateful and Stateless Protocols

Stateful and Stateless Protocols Stateful: server knows which step (state) has been reached Stateless: Client remembers the state, sends to server each time Server processes each request independently Can vary with level Many systems like Web run stateless protocols (e.g. HTTP) over streams…at the packet level, TCP streams are stateful HTTP itself is mostly stateless, but many HTTP requests (typically POSTs) update persistent state at the server

Advantages of stateless protocols Protocol usually simpler Server processes each request independently Load balancing and restart easier Typically easier to scale and make fault-tolerant Visibility: individual requests more self-describing

Advantages of stateful protocols Individual messages carry less data Server does not have to re-establish context each time There’s usually some changing state at the server at some level, except for completely static publishing systems

HTTP is Stateless (usually) HTTP is fundamentally stateless…response determined entirely from GET request & URI (unless resource itself changes!!) In practice: Cookies stored in browser tied to session state and login status OK: determine access rights from cookie NOT OK: return different content based on cookie WHY?: On the Web, we identify with URIs, not cookies! Of course, the state of resources does change: E.g. when we add something to a shopping cart

REST

REST – REpresenational State Transfer What is REST? A distributed system architecture In practice: implemented using HTTP Key features States transferred using representations (HTTP requests/responses) Server is stateless URIs (and request data if provided) convey state to application History Described by Roy Fielding in his PhD thesis Came after Tim BL invented the Web, but… …Roy was very influential in development of formal specifications of and details of HTTP and URIs Fielding’s PhD Thesis: http://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf

Building a REST application All data communicated using HTTP GET / POST Unit of interaction is document Formats like JSON, XML, RDF, N3 used to convey data State of navigation typically captured in URIs: URI to start application URI for 2nd step URI for 3rd step Captures client parameters & input in URIs (e.g. search parms, flight numbers) States of a REST application are linkable, bookmarkable Examples: http://webarch.noahdemo.com/MisuseGet/ Google search Google maps (AJAX example: uses REST recursively)

A Complex REST Application Google Maps Warning: some details of Google’s implementation may have changed since this presentation was prepared

AJAX Application – Google Maps Images retrieved in segments using ordinary Web HTTP Requests

AJAX Application – Google Maps JavaScript at client tracks mouse and moves images for smooth panning… asynchronously requests new image tiles in background

AJAX Application – Google Maps The Web is used to retrieve an ordinary file listing points of interest…. (used to be XML but could use JSON, other?) <?xml version="1.0" ?> <page> <title>hotels in hawthorne</title> <query>pizza in atlanta</query> <center lat="33.748888" lng="-84.388056" /> <info> <title>Wellesley Inn</title> <address> <line>540 Saw Mill River Rd.</line> <line>Elmsford, NY 10523</line> </address> </page>

AJAX Application – Google Maps …Data used to drive formatting of points of interest <?xml version="1.0" ?> <page> <title>hotels in hawthorne</title> <query>pizza in atlanta</query> <center lat="33.748888" lng="-84.388056" /> <info> <title>Wellesley Inn</title> <address> <line>540 Saw Mill River Rd.</line> <line>Elmsford, NY 10523</line> </address> </page>

AJAX Application – Google Maps …and XSLT in the browser converts that to HTML <?xml version="1.0" ?> <page> <title>hotels in hawthorne</title> <query>pizza in atlanta</query> <center lat="33.748888" lng="-84.388056" /> <info> <title>Wellesley Inn</title> <address> <line>540 Saw Mill River Rd.</line> <line>Elmsford, NY 10523</line> </address> </page>

AJAX Application – Google Maps

Question: What would happen if we built the Web from RPC? --- Revisited!

Why not use RPC for the Web? No uniform interface You’d need an interface definition to follow a link No global naming Most RPC data types poorly tuned for documents

Why not use RPC for the Web? No uniform interface You’d need an interface definition to follow a link No global naming Most RPC data types poorly tuned for documents No safe methods

Why not use RPC for the Web? No uniform interface You’d need an interface definition to follow a link No global naming Most RPC data types poorly tuned for documents No safe methods No content negotiation

HTTP Content Negotiation

Content negotiation for language URI is http://webarch.noahdemo.com/demo1/test.html HTTP GET I prefer English, please! Host: webarch.noahdemo.com GET /demo1/test.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us Accept-language: en-us But: it is OK to also provide individual URIs for each language version

Content negotiation for device URI is http://webarch.noahdemo.com/demo1/test.html HTTP GET GET /demo1/test.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us User-Agent: Mozilla/4.0 Firefox Host: webarch.noahdemo.com

Content negotiation for device URI is http://webarch.noahdemo.com/demo1/test.html HTTP GET Cell phone browser Host: webarch.noahdemo.com GET /demo1/test.html HTTP/1.0 Host: webarch.noahdemo.com User-Agent: Noahs Demo HttpClient v1.0 Accept: */* Accept-language: en-us User-Agent: Cell Phone Browser V1.1

Even phone numbers are on the Web <a href=“tel:18005551212”> Device independence The same URI works across devices… Even phone numbers are on the Web <a href=“tel:18005551212”> …you can see my reservation from your cell phone

Summary

Summary HTTP and the REST model are designed to meet the unique needs of the Web Document-oriented Discoverable Stateless Extensible Global scale Supports unique features like content negotiation