Published by Cody Oliver. Modified over 11 years ago.
1
High Performance Web Sites 14 rules for faster pages
Steve Souders Chief Performance Yahoo!
2
Exceptional Performance
quantify and improve the performance of all Yahoo! products worldwide
center of expertise
build tools, analyze data
gather, research, and evangelize best practices
3
Scope
performance breaks into two categories: response time and efficiency
current focus is response time of web products
coming up next: mobile, backend
4
The Importance of Frontend Performance
Backend = 5% Frontend = 95% Even primed cache, frontend = 88%
5
Time Spent on the Frontend
Empty Cache Primed Cache
amazon.com 82% 86%
aol.com 94%
cnn.com 81% 92%
ebay.com 98%
google.com 64%
msn.com 97% 95%
myspace.com 96%
wikipedia.org 80% 88%
yahoo.com
youtube.com
6
The Performance Golden Rule
80-90% of the end-user response time is spent on the frontend. Start there.
Greater potential for improvement
Simpler
Proven to work
If you could cut each in half, frontend (FE) changes would cut total response time by 40-45%, while backend (BE) changes would cut it by only 5-10%. BE changes are typically more complex: rearchitecting, optimizing code, adding or modifying hardware, distributing databases, etc. FE changes are simpler: change the web server config, place scripts and stylesheets differently in the page, combine requests, etc. At Yahoo!, we've cut response times on 50 properties, often by 25% or more.
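The arithmetic behind the golden rule can be sketched in a few lines. The function name is made up for illustration, and the 10-20% backend share is simply the remainder implied by the 80-90% frontend figure.

```javascript
// Fraction of total response time saved when one tier, which accounts for
// `tierShare` of the total, has its own time reduced by `reduction`.
function totalSavings(tierShare, reduction) {
  return tierShare * reduction;
}

const half = 0.5;
// Frontend share of 80-90%, cut in half
console.log("frontend:", totalSavings(0.8, half), "to", totalSavings(0.9, half));
// Backend share of 10-20%, cut in half
console.log("backend:", totalSavings(0.1, half), "to", totalSavings(0.2, half));
```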
7
Agenda Performance Research 14 Rules (plus more) Case Studies
Evangelism Live Analysis
8
Performance Research
9
Browser Cache Experiment
Add an image to the page:
Expires: Thu, 15 Apr :00:00 GMT
Last-Modified: Wed, 28 Sep :49:57 GMT
Percentage of users with an empty cache? = (# users with at least one 200 response) / (total # unique users)
Percentage of page views with an empty cache? = (# of 200 responses) / (total # responses)
10
Browser Cache Expt Results
users with empty cache: 40-60%
page views with empty cache: ~20%
On the first day of the experiment, no one had these images cached, so the empty cache percentage was 100%. As the days passed, more users had the images cached, so the percentages dropped until they reached a steady state.
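The two percentages can be computed from a request log for the test image. This is a minimal sketch, not the team's actual tooling: the log shape ({ user, status }) and the sample entries are assumptions made up for illustration. A 200 response means the image was not in the cache; a 304 means it was.

```javascript
// Each entry is one request for the test image.
// status 200 = empty cache, status 304 = primed cache.
function emptyCacheStats(log) {
  const users = new Set();
  const usersWith200 = new Set();
  let count200 = 0;
  for (const { user, status } of log) {
    users.add(user);
    if (status === 200) {
      usersWith200.add(user);
      count200 += 1;
    }
  }
  return {
    // users with at least one 200 response / total unique users
    usersEmptyCache: usersWith200.size / users.size,
    // 200 responses / total responses
    pageViewsEmptyCache: count200 / log.length,
  };
}

// Hypothetical log: user "a" arrives with an empty cache, then returns twice.
const sampleLog = [
  { user: "a", status: 200 },
  { user: "a", status: 304 },
  { user: "a", status: 304 },
  { user: "b", status: 304 },
  { user: "b", status: 304 },
];
console.log(emptyCacheStats(sampleLog));
```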
11
Experiment Takeaways The empty cache user experience is more prevalent than you think! Optimize for both primed cache and empty cache experience. Strategies such as combining scripts, stylesheets, or images reduce the number of HTTP requests for both an empty and a full cache page view. Configuring components to have an Expires header with a date in the future reduces the number of HTTP requests for only the full cache page view.
12
Impact of Cookies on Response Time
Cookie Size    Time      Delta
0 bytes        78 ms     0 ms
500 bytes      79 ms     +1 ms
1000 bytes     94 ms     +16 ms
1500 bytes     109 ms    +31 ms
2000 bytes     125 ms    +47 ms
2500 bytes     141 ms    +63 ms
3000 bytes     156 ms    +78 ms
keep sizes low
80 ms delay
dialup users
The performance team at Yahoo! ran an experiment to measure the impact of retrieving a document with various cookie sizes. The experiment measured a static HTML document with no elements in the page. The primary variable in the experiment was the cookie size. We ran the experiment using a test harness that fetches a set of URLs repeatedly while measuring how long it takes to load the page on DSL. These results highlight the importance of keeping the size of cookies as low as possible to minimize the impact on the user's response time. A 3000 byte cookie, or multiple cookies that total 3000 bytes, could add as much as an 80 ms delay for users on DSL bandwidth speeds. The delay is even worse for users on dial-up.
13
Experiment Takeaways
eliminate unnecessary cookies
keep cookie sizes low
set cookies at appropriate domain level
set Expires date appropriately: earlier date or none removes cookie sooner
Setting cookies at the appropriate path and domain is just as important as the size of the cookie, if not more. A cookie set at the .yahoo.com domain impacts the response time for every Yahoo! page in the .yahoo.com domain that a user visits. Eliminate unnecessary cookies. Keep cookie sizes as low as possible to minimize the impact on the user response time. Be mindful of setting cookies at the appropriate domain level so other sub-domains are not affected. Set an Expires date appropriately. An earlier Expires date or none removes the cookie sooner, improving the user response time.
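As a rough illustration of these takeaways, a Set-Cookie header can be scoped to a narrow domain and path and given an explicit expiry. The helper name, the cookie name, and the example.com hostnames below are hypothetical; only the Set-Cookie attribute names (Domain, Path, Expires) are standard.

```javascript
// Build a Set-Cookie header value scoped as narrowly as the caller asks.
// Omitting `expires` yields a session cookie, which is removed soonest.
function buildSetCookie(name, value, { domain, path = "/", expires } = {}) {
  const parts = [`${name}=${encodeURIComponent(value)}`];
  if (domain) parts.push(`Domain=${domain}`);
  parts.push(`Path=${path}`);
  if (expires) parts.push(`Expires=${expires.toUTCString()}`);
  return parts.join("; ");
}

// Scoped to one sub-domain and path, so other pages do not carry the cookie.
console.log(
  buildSetCookie("prefs", "compact", {
    domain: "sub.example.com",
    path: "/portfolio",
    expires: new Date(Date.UTC(2008, 0, 1)),
  })
);
```

Setting the same cookie with Domain=.example.com would attach it to every request under that domain, which is the cost the slide warns about.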
14
Parallel Downloads Two in parallel Four in parallel Eight in parallel
If a web page evenly distributed its components across two hostnames, the overall response time would be about twice as fast. The HTTP requests would look as shown in Figure 2, with four components downloaded in parallel (two per hostname). The horizontal width of the box is the same, to give a visual cue as to how much faster this page loads. Limiting parallel downloads to two per hostname is a guideline. By default, both Internet Explorer and Firefox follow the guideline, but users can override this default behavior. Internet Explorer stores the value in the Registry Editor. (See Microsoft Help and Support.) Firefox’s setting is controlled by the network.http.max-persistent-connections-per-server setting, accessible in the about:config page. It’s interesting to note that for HTTP/1.0, Firefox’s default is to download eight components in parallel per hostname. Figure 3 shows what it would look like to download these ten images if Firefox’s HTTP/1.0 settings are used. It’s even faster than Figure 2, and we didn’t have to split the images across two hostnames. Instead of relying on users to modify their browser settings, front-end engineers could simply use CNAMEs (DNS aliases) to split their components across multiple hostnames.
15
Maximizing Parallel Downloads
[Chart: response time (seconds) by number of aliases]
In our experiment, we vary the number of aliases: 1, 2, 4, 5, and 10. This increases the number of parallel downloads to 2, 4, 8, 10, and 20 respectively. We fetch 20 smaller-sized images (36 x 36 px) and 20 medium-sized images (116 x 61 px). To our surprise, increasing the number of aliases for loading the medium-sized images (116 x 61 px) worsens the response times when using four or more aliases. Increasing the number of aliases beyond two for the smaller-sized images (36 x 36 px) doesn't make much of an impact on the overall response time.
16
Maximizing Parallel Downloads
[Chart: response time (seconds) by number of aliases]
rule of thumb: use at least two but no more than four aliases
On average, using two aliases is best. One possible contributor to slower response times is CPU thrashing on the client caused by increasing the number of parallel downloads: the more images downloaded in parallel, the greater the thrashing. On my laptop at work, CPU usage jumped from 25% for 2 parallel downloads to 40% for 20 parallel downloads. These values can vary significantly across users' computers, but this is another factor to consider before increasing the number of aliases to maximize parallel downloads.
These results are for the case where the domain names are already cached in the browser. Where they are not cached, response times get significantly worse as the number of hostname aliases increases. For web pages that want to optimize the experience for first-time users, we recommend not increasing the number of domains. To optimize for the second page view, where the domains are most likely cached, increasing parallel downloads does improve response times. The choice depends on which scenario is most typical.
Another issue to consider is that DNS lookup times vary significantly across ISPs and geographic locations. Typically, DNS lookup times for users from non-US cities are significantly higher than those for users within the US. If a good percentage of your users come from outside the US, the benefit of increasing parallel downloads is offset by the time needed to make many DNS lookups.
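One way to split components across a small, fixed set of aliases is to derive the hostname from the component path, so a given image always maps to the same hostname and stays cacheable. This is a sketch under assumptions: the hash is a simple illustrative one and the img1/img2.example.com aliases are hypothetical.

```javascript
// Simple deterministic string hash (illustrative, not cryptographic).
function hashString(s) {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit value
  }
  return h;
}

// Pick one alias per component path. The same path always gets the same
// hostname, so the browser cache is not split across aliases.
function hostnameFor(path, aliases) {
  return aliases[hashString(path) % aliases.length];
}

// Two aliases, in line with the rule of thumb of two to four.
const aliases = ["img1.example.com", "img2.example.com"];
for (const path of ["/images/logo.gif", "/images/nav.gif", "/images/ad.gif"]) {
  console.log(`http://${hostnameFor(path, aliases)}${path}`);
}
```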
17
Experiment Takeaways consider the effects of CPU thrashing
DNS lookup times vary across ISPs and geographic locations domain names may not be cached
18
14 Rules
19
14 Rules
1. Make fewer HTTP requests
2. Use a CDN
3. Add an Expires header
4. Gzip components
5. Put stylesheets at the top
6. Move scripts to the bottom
7. Avoid CSS expressions
8. Make JS and CSS external
9. Reduce DNS lookups
10. Minify JS
11. Avoid redirects
12. Remove duplicate scripts
13. Configure ETags
14. Make AJAX cacheable
in priority order
addressing these rules improves response times
20
Rule 1: Make fewer HTTP requests
CSS sprites
combined scripts, combined stylesheets
preloading
image maps
inline images
CSS sprites are the best solution, but I arranged these in order of complexity. Sometimes there's a tension or tradeoff between richer content and performance. This is one of the few rules that improves response times for first-time (empty cache) visitors (~20% of page views, 40-60% of unique users).
21
CSS Sprites
<span style="background-image: url('sprites.gif'); background-position: -260px -90px;"></span>
size of combined image is less
Size of combined image actually decreases due to reduced image overhead (color tables, formatting info, etc.).
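The negative offsets in background-position select one tile of the combined image. As a sketch, assuming the sprite is laid out as a regular grid (the 130 x 30 px cell size is a made-up value chosen so the result matches the offsets on the slide), the offsets can be computed from a tile's column and row:

```javascript
// Offsets are negative because the background image is shifted left and up
// so that the wanted tile lands in the element's top-left corner.
function spritePosition(col, row, cellWidth, cellHeight) {
  return `background-position: ${-col * cellWidth}px ${-row * cellHeight}px;`;
}

// Tile in column 2, row 3 of a hypothetical grid of 130 x 30 px cells.
console.log(spritePosition(2, 3, 130, 30));
```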
22
Combined Scripts, Combined Stylesheets
amazon.com 3 1
aol.com 18
cnn.com 11 2
ebay.com 7
froogle.google.com
msn.com 9
myspace.com
wikipedia.org
yahoo.com 4
youtube.com
Average 6.5 1.5
23
Combined Scripts, Combined Stylesheets
combining six scripts into one eliminates five HTTP requests
challenges:
develop as separate modules
number of possible combinations vs. loading more than needed
maximize browser cache
one solution: dynamically combine and cache
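"Dynamically combine and cache" could look roughly like the sketch below. It is an in-memory illustration only: the function name is hypothetical, and `sources` (a map from module name to script text) stands in for reading the separately developed module files from disk.

```javascript
// Cache of combined output, keyed by the ordered list of module names.
const combinedCache = new Map();

// Concatenate the requested modules once, then serve the cached result.
function combineScripts(names, sources) {
  const key = names.join(",");
  if (!combinedCache.has(key)) {
    const body = names
      .map((name) => {
        if (!(name in sources)) throw new Error(`unknown module: ${name}`);
        return `/* ${name} */\n${sources[name]}`;
      })
      .join(";\n"); // guard against modules that omit a trailing semicolon
    combinedCache.set(key, body);
  }
  return combinedCache.get(key);
}

// Hypothetical modules.
const sources = {
  "menu.js": "var menu = {}",
  "util.js": "var util = {}",
};
console.log(combineScripts(["util.js", "menu.js"], sources));
```

Because the key depends on the order and set of names, every distinct combination is cached separately, which is the "number of possible combinations" tradeoff listed above.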
24
Preloading Download resources for the next page after the current page is done loading. Examples:
25
Image maps
server-side:
<a href="navbar.cgi"><img ismap src="imagemap.gif"></a>
→
client-side – preferred:
<img usemap="#map1" border=0 src="/images/imagemap.gif">
<map name="map1">
<area shape="rect" coords="0,0,31,31" href="home.html" title="Home">
…
</map>
drawbacks:
must be contiguous
defining area coordinates – tedious, errors
Client-side is preferred: it provides visual feedback about which portions of the image are active, and it is accessible to people with non-graphical browsers.
26
Inline Images
data: URL scheme
data:[<mediatype>][;base64],<data>
<IMG ALT="Red Star" SRC="">
not supported in IE
avoid increasing size of HTML pages: put inline images in cached stylesheets
The spec says the scheme "allows inclusion of small data items as 'immediate' data." It is mostly used for images, but can be used anywhere a URL is specified, such as A and SCRIPT. The spec mentions size limitations, but FF supports up to 100K. Base64 encoding increases the size of images. Since HTML pages are typically not cached, move inline images to cached stylesheets.
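A data: URL can be assembled from raw bytes as sketched below, which also shows why base64 increases size: every 3 bytes become 4 characters. The function names are illustrative, `Buffer` is assumed to be available (Node.js), and the three bytes in the example are just the ASCII letters "GIF", not a complete image.

```javascript
// Build a base64 data: URL following data:[<mediatype>][;base64],<data>
function toDataUrl(bytes, mediaType) {
  return `data:${mediaType};base64,${Buffer.from(bytes).toString("base64")}`;
}

// Length of the base64 text for a payload of `byteCount` bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

console.log(toDataUrl([0x47, 0x49, 0x46], "image/gif"));
console.log(base64Length(3000)); // characters needed for a 3000-byte image
```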
27
Rule 2: Use a CDN amazon.com Akamai aol.com cnn.com ebay.com Akamai, Mirror Image google.com msn.com SAVVIS myspace.com Akamai, Limelight wikipedia.org yahoo.com youtube.com distribute your static content before distributing your dynamic content backups, extended storage capacity, caching, absorb spikes drawbacks: affected by other sites, no direct control Homegrown solutions are also used. Cut 20% for Shopping.
28
Rule 2: Use a CDN
Adding your CDN(s) to YSlow:
Go to about:config.
Right-click in the window and choose New and String to create a new string preference.
Enter extensions.firebug.yslow.cdnHostnames for the preference name.
For the string value, enter the hostname of your CDN, for example, mycdn.com. Do not use quotes. If you have multiple CDN hostnames, separate them with commas.
29
Rule 3: Add an Expires header
not just for images
Images Stylesheets Scripts % with Expires Median Age
amazon.com 0/62 0/1 0/3 0% 114 days
aol.com 23/43 1/1 6/18 48% 217 days
cnn.com 0/138 0/2 2/11 1% 227 days
ebay.com 16/20 0/7 55% 140 days
froogle.google.com 1/23 4% 454 days
msn.com 32/35 3/9 80% 34 days
myspace.com 0/18 1 day
wikipedia.org 6/8 2/3 75%
yahoo.com 23/23 4/4 100% n/a
youtube.com 0/32 26 days
One objection: you can't add an Expires header because the content is constantly changing. But the Last-Modified headers show that's not the case. For MySpace it is, but Amazon, CNN, Froogle, and YouTube all have much older Last-Modified values.
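A far-future Expires value is just a date in HTTP format. The sketch below computes one; the helper name and the one-year lifetime are assumptions, and the Cache-Control: max-age header is a standard companion that the slide itself does not mention.

```javascript
// Response headers that let a component be cached for `days` days from `now`.
function farFutureHeaders(now, days) {
  const seconds = days * 24 * 60 * 60;
  const expires = new Date(now.getTime() + seconds * 1000);
  return {
    Expires: expires.toUTCString(), // e.g. "Tue, 01 Jan 2008 00:00:00 GMT"
    "Cache-Control": `max-age=${seconds}`,
  };
}

console.log(farFutureHeaders(new Date(), 365));
```

With a far-future date, a changed component needs a new URL (for example a version number in the filename), since browsers will not re-request the old one.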
30
Rule 4: Gzip components
you can affect users' download times
90%+ of browsers support compression
31
Gzip: not just for HTML HTML Scripts Stylesheets amazon.com x aol.com some cnn.com ebay.com froogle.google.com msn.com deflate myspace.com wikipedia.org yahoo.com youtube.com gzip scripts, stylesheets, XML, JSON (not images, PDF) Images and PDF files are already compressed. Gzipping them wastes CPU and can increase file sizes.
32
Gzip vs. Deflate Gzip compresses more Gzip supported in more browsers
Size Savings Script 3.3K 1.1K 67% 66% 39.7K 14.5K 64% 16.6K 58% Stylesheet 1.0K 0.4K 56% 0.5K 52% 14.1K 3.7K 73% 4.7K
33
Gzip Configuration
Apache 2.x: mod_deflate
AddOutputFilterByType DEFLATE text/html text/css application/x-javascript
HTTP request
Accept-Encoding: gzip, deflate
HTTP response
Content-Encoding: gzip
Vary: Accept-Encoding
needed for proxies
The Vary header is required for proxies, so they don't serve gzip to browsers that don't support it (and vice versa). The Vary header is added automatically by mod_gzip (not sure about mod_deflate).
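The request/response exchange above boils down to a small decision on the server. The following is a simplified sketch, not how mod_deflate is implemented: it ignores q-values in Accept-Encoding, and the list of compressible types is an assumption (the three types from the config line plus common XML and JSON types).

```javascript
// Text-like types worth compressing; images and PDFs are left alone.
const COMPRESSIBLE = [
  "text/html",
  "text/css",
  "application/x-javascript",
  "text/xml",
  "application/json",
];

// Decide which compression-related response headers to send.
function compressionHeaders(acceptEncoding, contentType) {
  const headers = {};
  const type = contentType.split(";")[0].trim().toLowerCase();
  if (!COMPRESSIBLE.includes(type)) return headers;
  // The response differs by Accept-Encoding, so tell proxies to key on it.
  headers["Vary"] = "Accept-Encoding";
  const offered = (acceptEncoding || "")
    .split(",")
    .map((e) => e.split(";")[0].trim().toLowerCase());
  if (offered.includes("gzip")) headers["Content-Encoding"] = "gzip";
  return headers;
}

console.log(compressionHeaders("gzip, deflate", "text/html; charset=utf-8"));
console.log(compressionHeaders("gzip, deflate", "image/png"));
```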
34
Gzip Edge Cases <1% of browsers have problems with gzip
IE 5.5: IE 6.0: Netscape 3.x, 4.x consider adding Cache-Control: Private remove ETags (Rule 13) hard to diagnose; problem getting smaller
35
Rule 5: Put stylesheets at the top
stylesheets block rendering in IE solution: put stylesheets in HEAD (per spec) avoids Flash of Unstyled Content use LINK Unfortunately, this is bad for FF because in FF stylesheets block parallel downloads.
36
Slowest is Fastest
CSS at the bottom: resources load faster, but nothing renders
CSS at the top: resources take longer, but render progressively (right choice)
@import at the top: same problems as bottom
REMEMBER: IE ONLY! css-top seems the fastest, but it is actually the slowest to download all the components needed to render the page. css-top-import shows that @import puts the stylesheet last, thus delaying rendering.
37
Rule 6: Move scripts to the bottom
scripts block parallel downloads across all hostnames
scripts block rendering of everything below them in the page
script defer attribute is not a solution: blocks rendering and downloads in FF, slight blocking in IE
38
Rule 7: Avoid CSS expressions
used to set CSS properties dynamically in IE
width: expression( document.body.clientWidth < 600 ? "600px" : "auto" );
problem: expressions execute many times (mouse move, key press, resize, scroll, etc.)
alternatives: one-time expressions, event handlers
39
One-Time Expressions
expression overwrites itself
<style>
P {
  background-color: expression(altBgcolor(this));
}
</style>
<script>
function altBgcolor(elem) {
  // Setting the style property explicitly replaces the expression,
  // so it is evaluated only once per element.
  elem.style.backgroundColor = (new Date()).getHours() % 2 ? "#F08A00" : "#B8D4FF";
}
</script>
40
Event Handlers
tie behavior to (fewer) specific events
window.onresize = setMinWidth;
function setMinWidth() {
  var aElements = document.getElementsByTagName("p");
  for ( var i = 0; i < aElements.length; i++ ) {
    aElements[i].runtimeStyle.width = ( document.body.clientWidth < 600 ? "600px" : "auto" );
  }
}
Could do this only for MSIE browser type.
41
Rule 8: Make JS and CSS external
inline: HTML document is bigger
external: more HTTP requests, but cached
variables:
page views per user (per session)
empty vs. primed cache stats
component re-use
external is typically better
extra credit: post-onload download, dynamic inlining
42
Post-Onload Download
inline in front page
download external files after onload
window.onload = downloadComponents;
function downloadComponents() {
  var elem = document.createElement("script");
  elem.src = "...";
  document.body.appendChild(elem);
  ...
}
speeds up secondary pages
43
Dynamic Inlining start with post-onload download
set cookie after components downloaded
server-side:
if cookie, use external
else, do inline with post-onload download
cookie expiration date is key
speeds up all pages
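The server-side branch can be sketched as follows. The cookie name (components=1), the function name, and the stylesheet URL are all hypothetical; the cookie is assumed to be set by page script once the post-onload download has finished.

```javascript
// Choose between external and inline CSS based on whether the browser has
// (probably) already cached the external file.
function renderHead(cookieHeader, inlineCss, cssUrl) {
  const hasCookie = /(?:^|;\s*)components=1(?:;|$)/.test(cookieHeader || "");
  if (hasCookie) {
    // Primed cache is likely: reference the external, cacheable file.
    return `<link rel="stylesheet" href="${cssUrl}">`;
  }
  // Empty cache is likely: inline now; the page downloads the external file
  // after onload and then sets the cookie.
  return `<style>${inlineCss}</style>`;
}

console.log(renderHead("", "p { margin: 0 }", "/css/site.css"));
console.log(renderHead("lang=en; components=1", "p { margin: 0 }", "/css/site.css"));
```

The cookie's expiration should roughly track how long the components stay cached, which is why the slide calls the expiration date key.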
44
Rule 9: Reduce DNS lookups
typically ms block parallel downloads OS and browser both have DNS caches
45
Adding DNS Lookups Increasing parallel downloads is worth an extra DNS lookup.
46
TTL (Time To Live)
TTL – how long record can be cached
1 minute 10 minutes 1 hour 5 minutes
browser settings override TTL
47
Browser DNS Cache
IE
DnsCacheTimeout: 30 minutes
KeepAliveTimeout: 1 minute
ServerInfoTimeout: 2 minutes
Firefox
network.dnsCacheExpiration: 1 minute
network.dnsCacheEntries: 20
network.http.keep-alive.timeout: 5 minutes
Fasterfox: 1 hour, 512 entries, 30 seconds
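To make the settings concrete, here is a toy model of a browser-style DNS cache with a fixed expiration and a maximum number of entries, regardless of the record's own TTL. It is only a sketch: the class is made up, and the eviction policy (drop the oldest inserted entry) is an assumption, not a description of how Firefox or IE behave.

```javascript
class DnsCache {
  constructor(expirationMs, maxEntries) {
    this.expirationMs = expirationMs;
    this.maxEntries = maxEntries;
    this.entries = new Map(); // hostname -> { address, storedAt }
  }

  // Returns the cached address, or undefined if missing or expired.
  get(hostname, now) {
    const entry = this.entries.get(hostname);
    if (!entry) return undefined;
    if (now - entry.storedAt >= this.expirationMs) {
      this.entries.delete(hostname);
      return undefined; // caller must do a fresh DNS lookup
    }
    return entry.address;
  }

  set(hostname, address, now) {
    if (!this.entries.has(hostname) && this.entries.size >= this.maxEntries) {
      // Assumed policy: evict the entry that was inserted first.
      this.entries.delete(this.entries.keys().next().value);
    }
    this.entries.set(hostname, { address, storedAt: now });
  }
}

// Values in the spirit of the Firefox defaults above: 1 minute, 20 entries.
const cache = new DnsCache(60 * 1000, 20);
cache.set("www.example.com", "192.0.2.1", 0);
console.log(cache.get("www.example.com", 30 * 1000)); // still cached
console.log(cache.get("www.example.com", 61 * 1000)); // expired
```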
48
Reducing DNS Lookups fewer hostnames – 2-4 keep-alive
49
Rule 10: Minify JavaScript
Minify External? Minify Inline? no yes froogle.google.com minify inline scripts, too
50
Minify vs. Obfuscate minify – it's safer Original JSMin Savings
Original     JSMin Savings   Dojo Savings
204K         31K (15%)       48K (24%)
44K          4K (10%)
98K          19K (20%)       24K (25%)
88K          23K (27%)       24K (28%)
42K          14K (34%)       16K (38%)
34K          8K (22%)        10K (29%)
Average 85K  17K (21%)       21K (25%)
not much difference
minify – it's safer
Minify instead of obfuscate: fewer bugs, less maintenance, easier to debug in production. After gzipping, there is no difference between minification and obfuscation.
51
Rule 11: Avoid redirects 3xx status codes – mostly 301 and 302
HTTP/ Moved Permanently Location: add Expires headers to cache redirects worst form of blocking example:
52
Redirects Redirects www.amazon.com no www.aol.com yes – secondary page
yes – initial page froogle.google.com
53
Avoid Redirects
missing trailing slash: use Alias or DirectorySlash
mod_rewrite
CNAMEs
log referer – track internal links
outbound links – harder
beacons – beware of race condition
XHR – bail at readyState 2
With Alias, DirectorySlash, and mod_rewrite you can't have URIs that are relative to the current directory.
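The missing-trailing-slash redirect can also be avoided at the source by emitting links that already end in a slash. A minimal sketch, assuming the site knows which paths are directories (the `directories` set and the example.com URLs are hypothetical) and that the global URL class is available:

```javascript
// Append the trailing slash to links that point at known directories, so the
// server never has to answer with a redirect to the slashed URL.
function fixDirectoryLink(href, directories) {
  const url = new URL(href);
  if (directories.has(url.pathname)) {
    url.pathname += "/";
  }
  return url.toString();
}

const directories = new Set(["/astrology", "/news"]);
console.log(fixDirectoryLink("http://www.example.com/astrology", directories));
console.log(fixDirectoryLink("http://www.example.com/about.html", directories));
```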
54
Rule 12: Remove duplicate scripts
hurts performance:
extra HTTP requests (IE only)
extra executions
atypical? 2 of 10 top sites contain duplicate scripts
team size, # of scripts
Extra HTTP requests happen in IE if the scripts are not cacheable or if the page is reloaded.
55
Script Insertion Functions
<?php
function insertScript($jsfile) {
    // Skip scripts that were already written to the page.
    if ( alreadyInserted($jsfile) ) {
        return;
    }
    pushInserted($jsfile);
    // Insert dependencies first so they load in the right order.
    if ( hasDependencies($jsfile) ) {
        $dependencies = getDependencies($jsfile);
        for ( $i = 0; $i < count($dependencies); $i++ ) {
            insertScript($dependencies[$i]);
        }
    }
    echo '<script type="text/javascript" src="' . getVersion($jsfile) . '"></script>';
}
?>
avoids dupes, makes sure dependencies are included in the right order, handles versioning
could do combining
56
Rule 13: Configure ETags
unique identifier returned in response
ETag: "c8897e-aee-4165acf0"
Last-Modified: Thu, 07 Oct :54:08 GMT
used in conditional GET requests
If-None-Match: "c8897e-aee-4165acf0"
If-Modified-Since: Thu, 07 Oct :54:08 GMT
if ETag doesn't match, can't send 304
ETag format
Apache: inode-size-timestamp
IIS: Filetimestamp:ChangeNumber
Use 'em or lose 'em
Apache: FileETag none
IIS:
ETag == entity tag
if-none-match overrides if-modified-since
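The precedence rule (If-None-Match overrides If-Modified-Since) can be sketched as a small function. This is a simplification: it compares a single ETag by exact string match and ignores weak validators and ETag lists. The header values in the example are made up.

```javascript
// Decide between 304 Not Modified and 200 OK for a conditional GET.
// `request` holds lower-cased request headers; `resource` describes the file.
function conditionalStatus(request, resource) {
  const ifNoneMatch = request["if-none-match"];
  if (ifNoneMatch !== undefined) {
    // When an ETag is sent, it alone decides; the date is ignored.
    return ifNoneMatch === resource.etag ? 304 : 200;
  }
  const ifModifiedSince = request["if-modified-since"];
  if (
    ifModifiedSince !== undefined &&
    Date.parse(resource.lastModified) <= Date.parse(ifModifiedSince)
  ) {
    return 304;
  }
  return 200;
}

// Same file served by two servers that generate different ETags
// (for example, because the inode differs): the date matches, but the
// ETag does not, so the full response is sent again.
const request = {
  "if-none-match": '"server1-aee-1"',
  "if-modified-since": "Mon, 01 Jan 2007 00:00:00 GMT",
};
const onServer2 = {
  etag: '"server2-aee-1"',
  lastModified: "Mon, 01 Jan 2007 00:00:00 GMT",
};
console.log(conditionalStatus(request, onServer2));
```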
57
Rule 14: Make AJAX cacheable
XHR, JSON, iframe, dynamic scripts can still be cached (and minified, and gzipped) a personalized response should still be cacheable for that person
58
AJAX Example: Yahoo! Mail Beta
address book XML request
→ GET /yab/[...]&r= HTTP/1.1
Host: us.xxx.mail.yahoo.com
← HTTP/ OK
Date: Thu, 12 Apr :39:09 GMT
Cache-Control: private,max-age=0
Last-Modified: Sat, 31 Mar :17:17 GMT
Content-Type: text/xml; charset=utf-8
Content-Encoding: gzip
address book changes infrequently
cache it; add last-modified-time in URL
18K for me!
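"Add last-modified-time in URL" could be done along these lines: the timestamp of the user's last address book change becomes part of the request URL, so the response can be cached for a long time and a change simply produces a new URL. The parameter name `lm` and the mail.example.com URL are hypothetical, not Yahoo! Mail's actual API.

```javascript
// Build the address book request URL with the last-modified time embedded.
function addressBookUrl(base, lastModified) {
  const url = new URL(base);
  const seconds = Math.floor(lastModified.getTime() / 1000);
  url.searchParams.set("lm", String(seconds));
  return url.toString();
}

// The URL stays the same until the address book changes again.
const lastChange = new Date(Date.UTC(2007, 2, 31, 12, 0, 0));
console.log(addressBookUrl("http://mail.example.com/yab?view=all", lastChange));
```

For this to help, the response would need cache-friendly headers (a far-future Expires instead of max-age=0).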
59
Next Rules
Split static content across multiple domains
Reduce the size of cookies
Host static content on a different domain
Minify CSS
Avoid IFrames
in priority order
addressing these rules improves response times
60
Case Studies
61
Case Study:
move JS to onload
remove bottom tabs
avoid redirects
image sprites
host JS on CDN
combine JS files
40-50%
top graph: response times over dialup
bottom graph: response times over broadband
62
What about performance and Web 2.0 apps?
client-side CPU is more of an issue user expectations are higher these rules still apply, new rules will come out start off on the right foot
63
Case Study: Mail Classic Mail User Workflow mail.yahoo.com
                      Time       Time       Delta
view inbox folder     2.40 s     12.48 s    +420%
read messages (x3)    4.98 s     1.52 s     -70%
compose message       6.39 s     1.53 s     -76%
confirm               2.21 s     0.34 s     -85%
send                  2.10 s     0 s        -100%
total time:           18.08 s    15.87 s    -12%
64
Evangelism
65
Evangelism
Book: High Performance Web Sites
Conferences: Yahoo! F2E Summit, Web 2.0 Expo, Rich Web Experience, OSCon, Ajax Experience, Blogher, Future of Web Apps
Blogs:
YUI Blog:
YDN Blog:
Open Source: YSlow
halo effect
helps all web users
documentation archive
hiring
communications w/in the company
perceived value
quality of end product (book, code)
66
YSlow http://developer.yahoo.com/yslow performance lint tool
scores web pages for each rule Firefox add-on integrated with Firebug open source license Knew we couldn't reach all the development teams at Yahoo!. Needed to teach them to fish.
68
Ten Top U.S Web Sites Page Weight Response Time YSlow Grade
405K 15.9 sec D 182K 11.5 sec F 502K 22.4 sec 275K 9.6 sec C froogle.google.com 18K 1.7 sec A 221K 9.3 sec 205K 7.8 sec 106K 6.2 sec 178K 5.9 sec 139K
69
Strong Correlation
[Chart: total page weight, response time, and inverse YSlow grade]
correlation(resp time, page weight) = 0.94
correlation(resp time, inverse YSlow) = 0.76
Web 2.0 challenges
Yahoo! doesn't quite follow the curve. It has the 2nd best YSlow grade and response time, even though it's the 4th heaviest page. The Yahoo! front page team is a long-time consumer of these performance best practices, and therefore scores well in YSlow and is able to squeeze more speed out of their page. Amazon's YSlow grade also doesn't reflect the page weight and response time. The main reason for this is the large number of images in their page (approximately 74 images). YSlow doesn't subtract points for images, so the Amazon page scores well but performs slowly.
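For reference, a correlation figure like the ones above is typically the Pearson coefficient; the slides do not say which measure was used, so treat that as an assumption. The sketch below computes it for the nine page weight / response time pairs that are readable in the "Ten Top U.S Web Sites" table (the tenth row is incomplete in this transcript), so the result need not match the slide's figure exactly.

```javascript
// Pearson correlation coefficient of two equal-length numeric arrays.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two arrays of the same length (at least 2 values)");
  }
  const n = xs.length;
  const mean = (a) => a.reduce((sum, v) => sum + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Page weight (K) and response time (sec) for the nine complete rows.
const pageWeightK = [405, 182, 502, 275, 18, 221, 205, 106, 178];
const responseSec = [15.9, 11.5, 22.4, 9.6, 1.7, 9.3, 7.8, 6.2, 5.9];
console.log(pearson(pageWeightK, responseSec).toFixed(2));
```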
70
Live Analysis
71
IBM Page Detailer packet sniffer Windows only IE, FF, any .exe
c:\windows\wd_WS2s.ini Executable=(NETSCAPE.EXE),(NETSCP6.EXE),(firefox.exe) free trial, $300 license
73
Firebug web development evolved inspect and edit HTML
tweak and visualize CSS debug and profile JavaScript monitor network activity (caveat) Firefox extension free
74
Takeaways focus on the frontend harvest the low-hanging fruit
you do control user response times small investment up front keeps on giving LOFNO – be an advocate for your users
75
Steve Souders
76
CC Images Used "Need for Speed" by Amnemon: "Max speed 15kmh" by xxxtoff: "maybe" by Tal Bright: "takeout" by dotpolka: "how do they do that" by Fort Photo: "Absolutely Nothing is Allowed Here" by Vicki & Chuck Rogers: "Zipper Pocket" by jogales: "new briefcase" by dcJohn: "Told you it was me!" by Pug!: "Robert's Legion" by dancharvey: "thank you" by nj dodge: