GENERAL SCALABILITY CONSIDERATIONS
Overview of scalability
As the number of users grows, maintain:
– Low latency
– High throughput
– High reliability
Latency
Latency = total time between when an operation is initiated and when the operation completes
[Figure: latency (measured in seconds) and responsiveness (measured in seconds)]
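The definition above can be sketched as a small timing helper. This is only an illustration: `measure_latency` and `slow_operation` are hypothetical names, and the 50 ms sleep stands in for a real request.

```python
import time

def measure_latency(operation):
    """Run `operation` once and return (result, elapsed seconds)."""
    start = time.perf_counter()          # moment the operation is initiated
    result = operation()
    elapsed = time.perf_counter() - start  # moment the operation completes
    return result, elapsed

def slow_operation():
    # Hypothetical stand-in for a client-server request (sleeps about 50 ms).
    time.sleep(0.05)
    return "done"

result, latency_seconds = measure_latency(slow_operation)
print(f"latency: {latency_seconds * 1000:.1f} ms")
```

In practice a single measurement is noisy, so it is usually better to time many operations and look at the distribution.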
Throughput
Throughput = number of operations completed per unit time
[Figure: web pages sending operations to a web server at 10/min, 4/min, and 2/min; throughput: 32/minute]
Reliability
Reliability = percentage of operations successfully completed
[Figure: web pages sending operations to a web server; failures per flow: 1/10, 0/10, 2/4, 0/2]
Reliability: 29/32 ≈ 90.6%
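The throughput and reliability figures can be reproduced with a few lines of arithmetic. This sketch assumes one reading of the figure: two web pages, each issuing flows of 10, 4, and 2 operations per minute, with 3 failed operations in total.

```python
# Assumed reading of the figure: two pages, each with three flows.
ops_per_page_per_minute = [10, 4, 2]
pages = 2

total_ops = pages * sum(ops_per_page_per_minute)  # operations per minute
failed_ops = 3                                    # 32 - 29, from the figure's total

throughput_per_minute = total_ops
reliability = (total_ops - failed_ops) / total_ops

print(f"throughput: {throughput_per_minute}/minute")  # throughput: 32/minute
print(f"reliability: {reliability:.1%}")              # reliability: 90.6%
```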
Scalability
Scalability means that even when the number of users grows into the thousands or millions, your website still maintains
– Low latency
– High throughput
– High reliability
Very rough reasonable goals
Configuration | Reasonable # of "simultaneous" users | Latency | Throughput | Reliability
One single-core server | Hundreds or maybe thousands | Low hundreds of milliseconds | A few hundred operations per second | 99%
One multi-core server | Thousands or tens of thousands | Around 100 milliseconds | Thousands of operations per second | 99%
A cluster of a few multi-core computers | Tens or hundreds of thousands | Under 100 milliseconds | Tens of thousands of operations per second | 99.99%
A small datacenter with a few dozen multi-core computers | Millions | A few dozen milliseconds (assuming a great network connection) | Hundreds of thousands of operations per second | %
Techniques to improve scalability
– Minimal size of messages
– Minimal number of messages
– Minimal amount of computation
– Local computation
– Replication
– Aggressive caching
– Aggressive indexing
Minimal size of messages
When client and server communicate…
– Only send the data needed at that moment
– Use a concise data format (e.g., JSON)
For example, suppose an app needs to retrieve a list of courses in response to a query so that it can show a list of links. Two options for the response follow.
Option #1: 565 bytes
CS 361 cscaffid Intro to SE Blah blah blah blah blah blah blah blah blah
CS 494 cscaffid Web development Blah blah blah blah blah blah blah blah blah
CS 496 cscaffid Cloud+Mobile development Blah blah blah blah blah blah blah blah blah
Option #2: 108 bytes
[{n:"CS361",t:"Intro to SE"}, {n:"CS494",t:"Web development"}, {n:"CS496",t:"Cloud+Mobile development"}]
1. Combine fields if appropriate (e.g., dept and number)
2. Omit fields if not needed (e.g., description)
3. Shorten field names if appropriate (e.g., n and t)
4. Use JSON if feasible
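The four steps above can be sketched as a small transformation from verbose records to compact ones. The verbose field names (`dept`, `number`, `instructor`, `title`, `description`) are assumptions for illustration, and the byte counts it prints will not match the slide's 565 and 108 exactly. Note that `json.dumps` quotes the keys, as strict JSON requires.

```python
import json

# Hypothetical verbose records; the field names are assumptions.
verbose_courses = [
    {"dept": "CS", "number": 361, "instructor": "cscaffid",
     "title": "Intro to SE", "description": "Blah blah blah blah blah"},
    {"dept": "CS", "number": 494, "instructor": "cscaffid",
     "title": "Web development", "description": "Blah blah blah blah blah"},
    {"dept": "CS", "number": 496, "instructor": "cscaffid",
     "title": "Cloud+Mobile development", "description": "Blah blah blah blah blah"},
]

def compact(course):
    """Combine dept and number, drop unused fields, shorten field names."""
    return {"n": f'{course["dept"]}{course["number"]}', "t": course["title"]}

compact_courses = [compact(c) for c in verbose_courses]

# separators=(",", ":") removes the spaces json.dumps adds by default.
verbose_bytes = len(json.dumps(verbose_courses).encode("utf-8"))
compact_bytes = len(json.dumps(compact_courses, separators=(",", ":")).encode("utf-8"))

print(f"verbose: {verbose_bytes} bytes, compact: {compact_bytes} bytes")
```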
Minified JS and CSS
Online services can squeeze the whitespace and other wasted characters out of your JS
– Search for JS "minifier" or "minimizer"
Ditto for CSS
Minimal number of messages
Eliminate unnecessary messages
– E.g., eliminate unnecessary images from the UI
Combine messages if feasible
– E.g., if you need to query CS and ECE courses, design the server to handle both queries at once
Defer messages if feasible
– E.g., give the user the option to defer logging in until it's absolutely necessary
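As a rough illustration of combining messages, the sketch below simulates a server endpoint that accepts several departments in one request. `handle_request`, `CATALOG`, and the ECE course numbers are hypothetical; each call to `handle_request` stands for one client-server message.

```python
# Hypothetical in-memory catalog standing in for the server's data.
CATALOG = {
    "CS": ["CS361", "CS494", "CS496"],
    "ECE": ["ECE101", "ECE102"],  # placeholder course numbers
}

request_count = 0

def handle_request(depts):
    """Simulated endpoint: one call represents one client-server message."""
    global request_count
    request_count += 1
    return {dept: CATALOG.get(dept, []) for dept in depts}

# Separate messages: one request per department.
separate = {}
for dept in ["CS", "ECE"]:
    separate.update(handle_request([dept]))
separate_messages = request_count

# Combined message: both departments in a single request.
request_count = 0
combined = handle_request(["CS", "ECE"])
combined_messages = request_count

print(separate_messages, combined_messages)  # 2 1
```

Both approaches return the same data; the combined version pays the per-message overhead only once.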
Minimal amount of computation
Avoid "feature bloat"
– Only implement the features you need
– This will also enhance usability!
Avoid blithely copy-pasting code
– E.g., it is tempting to do certain things at the top of every web page in your site (send JS, open a db connection) even when a page doesn't actually need them
Minimal amount of computation
Use the right data structures
– E.g., if you need to look values up by key, use an associative array rather than searching a list
Use the right APIs
– E.g., there is an AJAX API for retrieving JSON as an object, so don't try to write such an API yourself
– Your version will be buggy and slow!
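Both points can be sketched in Python, used here as a stand-in for the browser-side JavaScript the slide has in mind: the standard `json` module plays the role of the existing JSON API, and a `dict` plays the role of the associative array. The helper name `find_title_by_scan` is hypothetical.

```python
import json

courses_json = (
    '[{"n": "CS361", "t": "Intro to SE"},'
    ' {"n": "CS494", "t": "Web development"},'
    ' {"n": "CS496", "t": "Cloud+Mobile development"}]'
)

# Use the existing parser instead of writing one by hand.
courses = json.loads(courses_json)

def find_title_by_scan(course_list, number):
    """Linear scan: examines every record until it finds a match."""
    for course in course_list:
        if course["n"] == number:
            return course["t"]
    return None

# Associative array: build once, then each lookup is a single hash probe.
titles_by_number = {course["n"]: course["t"] for course in courses}

print(find_title_by_scan(courses, "CS494"))  # Web development
print(titles_by_number["CS494"])             # Web development
```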
Minimal amount of computation
Retrieve only the data you need
– E.g., if you need one row, use a WHERE clause in SQL (rather than retrieving all rows & looping)
– Looping just creates unnecessary computation!
Use SQL aggregate functions when practical
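A minimal sketch of the difference, using Python's built-in `sqlite3` module with an in-memory database. The `courses` table and its enrollment numbers are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE courses (number TEXT PRIMARY KEY, title TEXT, enrollment INTEGER)"
)
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [
        ("CS361", "Intro to SE", 120),              # hypothetical enrollments
        ("CS494", "Web development", 80),
        ("CS496", "Cloud+Mobile development", 60),
    ],
)

# Wasteful: retrieve every row, then loop in application code.
title_by_loop = None
for number, title, _enrollment in conn.execute("SELECT * FROM courses"):
    if number == "CS494":
        title_by_loop = title

# Better: let the database return only the row you need.
row = conn.execute(
    "SELECT title FROM courses WHERE number = ?", ("CS494",)
).fetchone()
title_by_where = row[0]

# Aggregate function: the database computes the maximum for you.
max_enrollment = conn.execute("SELECT MAX(enrollment) FROM courses").fetchone()[0]

conn.close()
print(title_by_where, max_enrollment)  # Web development 120
```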
Local computation
If a computation uses a very large amount of data, then move the computation to the data, instead of the data to the computation.
Example: Find the city with maximal rainfall in the US
Option #1:
– Server sends rainfall for 4500 cities to browser
– Browser loops through cities to choose maximum
Option #2:
– Server loops through cities to choose maximum
– Server sends just the maximum to the browser
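The two options can be compared by the size of what crosses the network. This sketch uses randomly generated rainfall values for 4500 placeholder cities; the city names, values, and JSON payload shapes are all assumptions.

```python
import json
import random

random.seed(0)  # reproducible fake data
rainfall = {f"City{i}": round(random.uniform(0, 100), 1) for i in range(4500)}

# Option #1: server sends everything; the "browser" finds the maximum.
option1_payload = json.dumps(rainfall)
browser_data = json.loads(option1_payload)
wettest_in_browser = max(browser_data.items(), key=lambda item: item[1])

# Option #2: server finds the maximum and sends only that.
city, amount = max(rainfall.items(), key=lambda item: item[1])
option2_payload = json.dumps({"city": city, "rainfall": amount})

print(f"option 1: {len(option1_payload)} bytes")
print(f"option 2: {len(option2_payload)} bytes")
```

Both options produce the same answer, but Option #2 transmits a few dozen bytes instead of tens of kilobytes.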
Replication
Make copies of your computation and data
[Figure: web pages served by several replicated web servers, with flows of 10/min, 4/min, and 2/min; throughput: 32/minute]
Replication
You can also replicate your database
– Databases can be configured to automatically "mirror" contents
[Figure: web server connected to a database]
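To make the idea of mirroring concrete, here is a toy sketch in which every write to a primary store is copied to each replica, and reads are spread across the replicas. `MirroredStore` is a hypothetical class; real databases handle this through configuration, and the synchronous copying shown here is a simplifying assumption.

```python
class MirroredStore:
    """Toy primary/replica store: writes are copied to every replica."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = 0

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:
            replica[key] = value  # assumption: mirroring happens synchronously

    def read(self, key):
        # Spread reads across replicas in round-robin order.
        replica = self.replicas[self._next_replica % len(self.replicas)]
        self._next_replica += 1
        return replica.get(key)

store = MirroredStore(replica_count=2)
store.write("CS494", "Web development")
reads = [store.read("CS494") for _ in range(3)]
print(reads)  # ['Web development', 'Web development', 'Web development']
```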
Shopping for a hosting service
When leasing space from a "hosting service"
– You pay them $X per month
– They let you use Y machines
If you want replication, look for…
– Load balancing: automatic routing of traffic evenly across the machines you lease
– Mirroring: automatic copying of data updates from one server to another ("master/slave")
– Failover: automatic routing (and restart) around machines that crash
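Load balancing and failover can be illustrated with a small round-robin router that skips machines marked as down. This is a sketch of the concept only: `RoundRobinBalancer` and the server names are hypothetical, and a hosting service's load balancer would also detect crashes and restart machines on its own.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Route requests evenly across servers, skipping any marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def route(self):
        # Try each server at most once per call.
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")

balancer = RoundRobinBalancer(["web1", "web2", "web3"])
first_round = [balancer.route() for _ in range(6)]

balancer.mark_down("web2")  # simulate a crash
after_failure = [balancer.route() for _ in range(4)]

print(first_round)    # ['web1', 'web2', 'web3', 'web1', 'web2', 'web3']
print(after_failure)  # ['web1', 'web3', 'web1', 'web3']
```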
Learning about replication
If you really want to get your hands dirty with the details of replication…
– CS496: Mobile + Cloud Software Development
– CS440: (Advanced) Database Management
Aggressive caching & indexing
Caching: If a computation or transmission is expensive, then do it once, save the result, and reuse the result later
Indexing: If you have lots of data, create a data structure that makes it easier to find the data
These will each be covered by a whole lecture
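As a small preview of both ideas, the sketch below caches a function's results with the standard library's `functools.lru_cache` and builds a dictionary index over a list of records. `expensive_square` is a hypothetical stand-in for a costly computation, and the call counter exists only to show that the second call is served from the cache.

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=None)
def expensive_square(n):
    """Stand-in for an expensive computation; counts real executions."""
    global call_count
    call_count += 1
    return n * n

first = expensive_square(12)   # computed
second = expensive_square(12)  # served from the cache, function body not run

# Indexing: build a lookup structure once so later searches avoid a full scan.
courses = [
    {"n": "CS361", "t": "Intro to SE"},
    {"n": "CS494", "t": "Web development"},
    {"n": "CS496", "t": "Cloud+Mobile development"},
]
index_by_number = {course["n"]: course for course in courses}

print(first, second, call_count)       # 144 144 1
print(index_by_number["CS496"]["t"])   # Cloud+Mobile development
```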