CS193H: High Performance Web Sites Lecture 8: Rule 4 – Gzip Components

Presentation transcript:

CS193H: High Performance Web Sites Lecture 8: Rule 4 – Gzip Components Steve Souders Google souders@cs.stanford.edu

Announcements Web 100 Performance Profile (round 1) class project has been graded – contact Aravind if you want to know your grade

Compression (encoding)

  Request without Accept-Encoding:
    GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
    Host: www.blogger.com
    User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1

  Response (uncompressed):
    HTTP/1.1 200 OK
    Content-Type: application/x-javascript
    Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
    Content-Length: 6230

    function d(s) {...

  Request with Accept-Encoding:
    GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
    Host: www.blogger.com
    User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
    Accept-Encoding: gzip,deflate

  Response (gzipped):
    HTTP/1.1 200 OK
    Content-Type: application/x-javascript
    Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
    Content-Length: 2066
    Content-Encoding: gzip

    XmoÛHþ\ÿFÖvã*wØoq...

  Gzip typically reduces size by about 70%; in this example, (6230 - 2066) / 6230 = 67%.
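The savings figure on this slide is easy to reproduce. The following is a minimal Python sketch rather than anything a web server actually runs: the helper name `savings` is hypothetical, and the sample script body is made up, so its compressed size will not match the blogger.com file.

```python
import gzip


def savings(original_size: int, compressed_size: int) -> float:
    """Fraction of bytes saved, computed the same way as on the slide."""
    return (original_size - compressed_size) / original_size


# The slide's numbers: 6230 bytes uncompressed, 2066 bytes gzipped.
slide_savings = savings(6230, 2066)
print(f"slide example: {slide_savings:.0%}")  # prints "slide example: 67%"

# A made-up, repetitive script body standing in for a real JavaScript file.
body = b"function d(s) { return document.getElementById(s); }\n" * 100
compressed = gzip.compress(body)

# Round trip: decompressing yields exactly the original bytes.
restored = gzip.decompress(compressed)
print(f"sample body: {len(body)} -> {len(compressed)} bytes")
```

Highly repetitive input like this sample compresses far better than typical scripts, so treat the second number as an illustration of the mechanism only.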

Gzip vs. Deflate

                        Gzip              Deflate
              Size      Size   Savings    Size   Savings
  Script       3.3K     1.1K   67%         1.1K  66%
  Script      39.7K    14.5K   64%        16.6K  58%
  Stylesheet   1.0K     0.4K   56%         0.5K  52%
  Stylesheet  14.1K     3.7K   73%         4.7K  67%

  gzip (default settings) compresses more

Pros and Cons
  Pro: smaller transfer size
  Con: CPU cycles, on both client and server
  Don't compress resources smaller than 1K

Gzip configuration

  Apache 1.3: mod_gzip
    mod_gzip_item_include file \.html$
    mod_gzip_item_include mime ^text/html$
    mod_gzip_item_include file \.js$
    mod_gzip_item_include mime ^application/x-javascript$
    mod_gzip_item_include file \.css$
    mod_gzip_item_include mime ^text/css$

  Apache 2.x: mod_deflate
    AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
    control compression level: DeflateCompressionLevel
    http://httpd.apache.org/docs/2.0/mod/mod_deflate.html
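To get a feel for what a compression-level setting such as DeflateCompressionLevel trades off, the sketch below compresses one payload at several zlib levels (1 = fastest, 9 = smallest output). It uses Python's `zlib` module, not Apache itself, and the stylesheet text is generated for illustration, so the byte counts are only indicative.

```python
import zlib

# Made-up stylesheet text; real files will compress differently.
rules = [
    f".item-{i} {{ margin: {i % 7}px; padding: {i % 5}px; color: #333; }}\n"
    for i in range(500)
]
payload = "".join(rules).encode("utf-8")

# Compress the same payload at a low, a middle, and the highest level.
sizes = {}
for level in (1, 6, 9):
    sizes[level] = len(zlib.compress(payload, level))
    print(f"level {level}: {len(payload)} -> {sizes[level]} bytes")
```

Higher levels spend more CPU per response, which is the cost named on the Pros and Cons slide; measuring on your own files is the only reliable way to pick a level.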

Gzip: not just for HTML
  gzip scripts, stylesheets, XML, JSON (not images, Flash, PDF)
  Images and PDF files are already compressed. Gzipping them wastes CPU and can increase file sizes.

  [Tables: whether each top site gzips its HTML, scripts, and stylesheets; the cell marks (x, some, deflate, na) could not be matched to columns.]
  March 2007 sites: amazon.com, aol.com, cnn.com, ebay.com, froogle.google.com, msn.com, myspace.com, wikipedia.org, yahoo.com, youtube.com
  October 2008 sites: aol.com, ebay.com, facebook.com, google.com/search, search.live.com/results, msn.com, myspace.com, en.wikipedia.org/wiki, yahoo.com, youtube.com
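The two rules so far (compress text formats but not images, Flash, or PDF; skip resources under 1K) can be combined into one check. This is a hedged sketch: `should_gzip` and both constants are hypothetical names, and the exact MIME type list is an assumption extrapolated from the types named in the slides.

```python
# Text-based content types worth compressing. The slides name HTML, scripts,
# stylesheets, XML, and JSON; the exact MIME strings here are an assumption.
COMPRESSIBLE_TYPES = {
    "text/html",
    "text/css",
    "text/xml",
    "application/xml",
    "application/json",
    "application/x-javascript",
    "application/javascript",
}

MIN_SIZE_BYTES = 1024  # "Don't compress resources < 1K"


def should_gzip(content_type: str, size_bytes: int) -> bool:
    """Return True if a response is worth gzipping under the lecture's rules."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    base_type = content_type.split(";")[0].strip().lower()
    return base_type in COMPRESSIBLE_TYPES and size_bytes >= MIN_SIZE_BYTES


print(should_gzip("text/html; charset=utf-8", 5000))  # True
print(should_gzip("image/png", 50000))                # False: already compressed
print(should_gzip("text/css", 400))                   # False: under 1K
```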

Edge Case: Proxies
  1. browser A to proxy:   GET main.js, Accept-Encoding: gzip
  2. proxy to origin:      GET main.js, Accept-Encoding: gzip
  3. origin to proxy:      main.js, Content-Encoding: gzip
  4. proxy caches:         main.js, Content-Encoding: gzip
  5. proxy to browser A:   main.js, Content-Encoding: gzip
  6. browser B to proxy:   GET main.js (no Accept-Encoding)
  7. proxy to browser B:   main.js, Content-Encoding: gzip
  Proxies may serve gzipped content to browsers that don't support it, and vice versa.

Edge Case: Proxies with Vary
  1.  browser A to proxy:  GET main.js, Accept-Encoding: gzip
  2.  proxy to origin:     GET main.js, Accept-Encoding: gzip
  3.  origin to proxy:     main.js, Content-Encoding: gzip, Vary: Accept-Encoding
  4.  proxy caches:        main.js, Content-Encoding: gzip [Accept-Encoding: gzip]
  5.  proxy to browser A:  main.js, Content-Encoding: gzip
  6.  browser B to proxy:  GET main.js (no Accept-Encoding)
  7.  proxy to origin:     GET main.js (no Accept-Encoding)
  8.  origin to proxy:     main.js, Vary: Accept-Encoding
  9.  proxy caches:        main.js [Accept-Encoding: ]
  10. proxy to browser B:  main.js (no gzip)
  11. browser to proxy:    GET main.js, Accept-Encoding: gzip
  12. proxy to browser:    main.js, Content-Encoding: gzip (from cache)
  13. browser to proxy:    GET main.js (no Accept-Encoding)
  14. proxy to browser:    main.js (no gzip) (from cache)
  Fix: add Vary: Accept-Encoding
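The difference between the two proxy scenarios comes down to the cache key. The toy simulation below is a sketch under simplifying assumptions: `origin` and `Proxy` are hypothetical names, there is only one URL, and the proxy compares the raw Accept-Encoding string instead of parsing it as a real proxy would.

```python
from typing import Dict, Optional, Tuple

Headers = Dict[str, str]
Response = Tuple[Headers, str]  # (response headers, body description)


def origin(request: Headers, send_vary: bool) -> Response:
    """Hypothetical origin server: gzips only when the client asks for it."""
    headers: Headers = {}
    if send_vary:
        headers["Vary"] = "Accept-Encoding"
    if "gzip" in request.get("Accept-Encoding", ""):
        headers["Content-Encoding"] = "gzip"
        return headers, "main.js (gzipped)"
    return headers, "main.js (no gzip)"


class Proxy:
    """Toy shared cache for a single URL."""

    def __init__(self, send_vary: bool) -> None:
        self.send_vary = send_vary
        self.cache: Dict[Optional[str], Response] = {}

    def get(self, request: Headers) -> Response:
        accept = request.get("Accept-Encoding", "")
        # Look for a variant keyed on Accept-Encoding, then an unkeyed entry.
        for key in (accept, None):
            if key in self.cache:
                return self.cache[key]
        response = origin(request, self.send_vary)
        # With Vary: Accept-Encoding, store one entry per Accept-Encoding value;
        # without it, store a single entry that is reused for every client.
        store_key = accept if "Vary" in response[0] else None
        self.cache[store_key] = response
        return response


gzip_browser: Headers = {"Accept-Encoding": "gzip"}
old_browser: Headers = {}

without_vary = Proxy(send_vary=False)
without_vary.get(gzip_browser)           # primes the cache with the gzipped copy
print(without_vary.get(old_browser)[1])  # main.js (gzipped): wrong for this browser

with_vary = Proxy(send_vary=True)
with_vary.get(gzip_browser)
print(with_vary.get(old_browser)[1])     # main.js (no gzip)
```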

Edge Case: Bad Browsers
  < 1% of browsers have problems with gzip
    IE 5.5: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q313712
    IE 6.0: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q31249
    Netscape 3.x, 4.x: http://www.schroepl.net/projekte/mod_gzip/browser.htm
  User-Agent white list for gzip
    Apache 1.3:
      mod_gzip_item_include reqheader "User-Agent: MSIE [6-9]"
      mod_gzip_item_include reqheader "User-Agent: Mozilla/[5-9]"
    Apache 2.0:
      BrowserMatch "MSIE [6-9]" gzip
      BrowserMatch ^Mozilla/[5-9] gzip
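The white-list idea can be tried out against sample User-Agent strings. The sketch below reuses the two patterns from the slide; `supports_gzip` is a hypothetical helper, and the User-Agent strings in the usage lines are representative examples written for this illustration, not captured traffic.

```python
import re

# Patterns taken from the slide's white list; anything else gets no gzip.
GZIP_WHITELIST = [
    re.compile(r"MSIE [6-9]"),      # IE 6 and later announce "MSIE 6.0" etc.
    re.compile(r"^Mozilla/[5-9]"),  # Mozilla/5.0 and later (Firefox, Safari, ...)
]


def supports_gzip(user_agent: str) -> bool:
    """Return True if the User-Agent matches the white list."""
    return any(pattern.search(user_agent) for pattern in GZIP_WHITELIST)


print(supports_gzip("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))  # True
print(supports_gzip("Mozilla/5.0 (Windows; U) Gecko/2008070208 Firefox/3.0.1"))  # True
print(supports_gzip("Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"))  # False
print(supports_gzip("Mozilla/4.78 [en] (Windows NT 5.0; U)"))  # False
```

Note that the MSIE pattern is unanchored because Internet Explorer's User-Agent string begins with "Mozilla/4.0", not "MSIE".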

Edge Case: Bad Browsers (cont'd)
  proxies could mix up responses: give a cached response from useragent1 to useragent2
  could add Vary: User-Agent
    so many possibilities that it defeats proxy caching
  better to add Cache-Control: private
    downside: disables all proxy caches
  is it a serious problem?
    hard to diagnose; the problem is getting smaller

Edge Case: ETags
  what happens when a proxy makes Conditional GET requests?
  Last-Modified date for gzipped vs. ungzipped is different
    => If-Modified-Since works fine
  ETag is the same in Apache for gzipped & ungzipped
    => If-None-Match succeeds, proxy could give browser mismatched content
  remove ETags! (Rule 13)
  http://issues.apache.org/bugzilla/show_bug.cgi?id=39727

Edge Case: ETags present
  1. browser A to proxy:  GET main.js, Accept-Encoding: gzip
  2. proxy to origin:     GET main.js, Accept-Encoding: gzip
  3. origin to proxy:     main.js, Content-Encoding: gzip, Cache-Control: max-age=0, ETag: "de158-e58-c7ee4140"
  4. proxy caches:        main.js, Content-Encoding: gzip, Cache-Control: max-age=0, ETag: "de158-e58-c7ee4140"
  5. proxy to browser A:  main.js, Content-Encoding: gzip
  6. browser B to proxy:  GET main.js (no Accept-Encoding)
  7. proxy to origin:     GET main.js, If-None-Match: "de158-e58-c7ee4140"
  8. origin to proxy:     304 Not Modified
  9. proxy to browser B:  main.js, Content-Encoding: gzip
  The proxy gives the browser mismatched content.

Edge Case: ETags removed
  1.  browser A to proxy:  GET main.js, Accept-Encoding: gzip
  2.  proxy to origin:     GET main.js, Accept-Encoding: gzip
  3.  origin to proxy:     main.js, Content-Encoding: gzip, Cache-Control: max-age=0, Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
  4.  proxy caches:        main.js, Content-Encoding: gzip, Cache-Control: max-age=0, Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
  5.  proxy to browser A:  main.js, Content-Encoding: gzip
  6.  browser B to proxy:  GET main.js (no Accept-Encoding)
  7.  proxy to origin:     GET main.js, If-Modified-Since: Thu, 21 Aug 2008 23:53:57 GMT
  8.  origin to proxy:     main.js, Cache-Control: max-age=0, Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
  9.  proxy caches:        main.js, Cache-Control: max-age=0, Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
  10. proxy to browser B:  main.js (no gzip)
  Removing ETags avoids the problem.
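The ETag failure can be replayed in a few lines. This is a toy model under stated assumptions: `origin`, `proxy_revalidate`, and `second_browser_gets` are hypothetical names, the origin issues the same ETag for both encodings as the slides describe for Apache, and the If-Modified-Since path is left out, so without an ETag the toy proxy simply refetches the resource.

```python
from typing import Dict, Tuple

Headers = Dict[str, str]
ETAG = '"de158-e58-c7ee4140"'  # value shown on the slide


def origin(request: Headers, send_etag: bool) -> Tuple[int, Headers, str]:
    """Hypothetical origin: one ETag shared by gzipped and ungzipped main.js."""
    if send_etag and request.get("If-None-Match") == ETAG:
        return 304, {"ETag": ETAG}, ""
    headers: Headers = {"ETag": ETAG} if send_etag else {}
    if "gzip" in request.get("Accept-Encoding", ""):
        headers["Content-Encoding"] = "gzip"
        return 200, headers, "main.js (gzipped)"
    return 200, headers, "main.js (no gzip)"


def proxy_revalidate(cached: Tuple[Headers, str], client: Headers, send_etag: bool) -> str:
    """Toy proxy holding a stale entry (max-age=0) that it must revalidate."""
    cached_headers, cached_body = cached
    upstream = dict(client)
    if "ETag" in cached_headers:
        upstream["If-None-Match"] = cached_headers["ETag"]
    status, _, body = origin(upstream, send_etag)
    # 304 means "your copy is still good", so the proxy reuses its cached body.
    return cached_body if status == 304 else body


def second_browser_gets(send_etag: bool) -> str:
    # The first browser accepts gzip; the proxy caches that response.
    _, headers, body = origin({"Accept-Encoding": "gzip"}, send_etag)
    # The second browser sends no Accept-Encoding header.
    return proxy_revalidate((headers, body), {}, send_etag)


print(second_browser_gets(send_etag=True))   # main.js (gzipped): mismatched content
print(second_browser_gets(send_etag=False))  # main.js (no gzip)
```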

Edge Case Fixes
  Fixes surveyed: Vary: Accept-Encoding, Cache-Control: private, ETag
  Vary: User-Agent is not used
  [Table: which fixes each top site uses (dates on slide: October 2008, March 2007); the marks could not be matched to columns.]
  Sites and marks as extracted: aol.com x; ebay.com x (IIS); facebook.com; google.com/search; search.live.com/results; msn.com; myspace.com x (Apa); en.wikipedia.org/wiki; yahoo.com; youtube.com some

Homework
  "Improving Top Site" class project:
    add improvements for Rule 4
    measure improvements using Hammerhead
    record results in your personal Web 100 sheet
  read Chapter 5 of HPWS for 10/17

Questions
  How much are file sizes typically reduced by using gzip compression?
  What types of resources (images, scripts, etc.) should not be compressed?
  For the resource types that should be compressed, should they always be compressed?
  How do you prevent proxies from serving gzipped resources to browsers that don't support gzip?
  How can ETags cause proxies to serve mismatched content to browsers?