Published by Dwain Payne. Modified over 8 years ago.
1
ARCHITECTING APPLICATIONS FOR HIGH SCALABILITY: Leveraging the Windows Azure Platform
Scott Densmore, Sr. Software Development Engineer, Microsoft patterns & practices
2
ABOUT YOU (AN ASSUMPTION)
You…
- are a developer
- know C#
- have a basic understanding of Windows Azure
3
GOALS FOR THIS SESSION
- Learn what is available in Windows Azure to help you build scalable systems
- (Re)-discover helpful design patterns
- Learn about practical techniques
- Identify (and avoid) potential problems
4
TAILSPIN
5
DEMO TailSpin Surveys
6
TAKE THE SURVEY http://tailspindemo.cloudapp.net/survey/fabrikam/slovenia
7
LOCATION Where should my application live?
8
GEO-LOCATION
9
WINDOWS AZURE TRAFFIC MANAGER
10
[Figure: three latency labels revealed in turn: 50ms, 100ms, 200ms]
14
- Load balancing across multiple Hosted Services
- Integrated in the Windows Azure Platform portal
- Performance: directs the user to the best / closest deployment
- Fault Tolerance: redirects traffic to another deployment based on availability
- Round Robin: traffic is routed to deployments based on a fixed ratio
15
WINDOWS AZURE TRAFFIC MANAGER
- Multiple factors determine DNS resolution
- Configured by Microsoft
  - Geo-IP mapping
  - Periodic performance measurement
- Configured by service owner
  - Policy: Performance, Failover, Geo, Ratio
  - Monitoring
- Currently in CTP
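The difference between the policies can be sketched in a few lines of plain Python. This is only an illustration of the selection logic, not the Traffic Manager API: the `Deployment` type, the region names, and the function names are hypothetical, and the latencies simply reuse the 50ms / 100ms / 200ms figures shown earlier.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Deployment:
    name: str          # hypothetical region name
    latency_ms: int    # illustrative latency seen from the caller
    healthy: bool = True


def pick_performance(deployments):
    """Performance policy: the healthy deployment with the lowest latency."""
    return min((d for d in deployments if d.healthy), key=lambda d: d.latency_ms)


def pick_failover(deployments):
    """Failover policy: the first healthy deployment in priority order."""
    return next(d for d in deployments if d.healthy)


def round_robin(deployments):
    """Round-robin policy: rotate through the deployments in equal shares."""
    return cycle(deployments)


deployments = [
    Deployment("north-europe", 50),
    Deployment("us-east", 100),
    Deployment("asia-east", 200),
]
```

In the real service the choice is made at DNS resolution time, so a client keeps using the resolved deployment until the DNS record expires.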
16
WINDOWS AZURE CDN
- Integrated with Storage
- Delivery from Windows Azure Compute instances
- HTTPS support
- CTP of Smooth Streaming
17
LEVERAGING THE CDN
19
MANAGING CDN CONTENT EXPIRATION
- Default behavior is to fetch once and cache for up to 72 hrs
- Modify the cache control blob header to control the TTL:
  x-ms-blob-cache-control: public, max-age=
- Think hours, days or weeks
- Higher numbers reduce cost and latency via CDN & downstream caches
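As a small sketch of the header value: `max-age` is expressed in seconds, so a TTL given in days or weeks has to be converted. The seven-day TTL and the `cache_control` helper below are assumptions chosen for illustration; the slide does not give a value.

```python
from datetime import timedelta


def cache_control(ttl: timedelta) -> str:
    """Build a public Cache-Control value with max-age in whole seconds."""
    return f"public, max-age={int(ttl.total_seconds())}"


# Header to send when setting the blob's properties (the TTL is an assumed example).
headers = {"x-ms-blob-cache-control": cache_control(timedelta(days=7))}
```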
20
MANAGING CDN CONTENT EXPIRATION
- Use versioned URLs to expire content on demand
- Enables easy rollback and A/B testing
- Example: logo.2011-05-01.png, logo.2011-05-29.png
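One way to produce such names is to insert the release date before the file extension. The helper below is a hypothetical sketch of that naming scheme, not part of any Azure library.

```python
from datetime import date


def versioned_name(filename: str, version: date) -> str:
    """Insert an ISO date before the extension: logo.png -> logo.2011-05-01.png."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension: append the version at the end
        return f"{filename}.{version.isoformat()}"
    return f"{stem}.{version.isoformat()}.{ext}"
```

Because every version has its own URL, each one can be cached for a long time, and rolling back is just a matter of pointing the page at the older name.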
21
IDENTITY Who is using my application?
22
IDENTITY
24
SHARED ACCESS SIGNATURES
- Provide direct access to content
- Can be time-bound or revoked on demand
- Also works for write access (e.g. user-generated content)
25
SHARED ACCESS SIGNATURES
Diagram labels: Hosted Compute; X, a non-public blob (e.g. paid or ad-funded content)
1. “I am Bob & I want X”
2. Service prepares a Shared Access Signature (SAS) to X using the securely stored storage account key
3. Service returns SAS (signed HTTPS URL)
4. Bob uses SAS to access X directly from Blob Storage for reduced latency & compute load
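The four steps above can be sketched with a generic HMAC-signed URL. This shows the idea only: it is not the real Azure SAS string-to-sign or query-parameter format, and the key, host name, and function names are all hypothetical.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import urlencode

ACCOUNT_KEY = b"demo-key-not-a-real-secret"  # hypothetical; kept only on the service


def sign(path: str, expires: int) -> str:
    """Sign the resource path and expiry time with the account key."""
    message = f"{path}\n{expires}".encode("utf-8")
    digest = hmac.new(ACCOUNT_KEY, message, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def make_signed_url(base_url: str, path: str, ttl_seconds: int, now=None) -> str:
    """Steps 2 and 3: the service builds a time-bound signed URL for the caller."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    query = urlencode({"expires": expires, "sig": sign(path, expires)})
    return f"{base_url}{path}?{query}"


def verify(path: str, expires: int, sig: str, now=None) -> bool:
    """Step 4, storage side: accept only unexpired, correctly signed requests."""
    now = time.time() if now is None else now
    return now <= expires and hmac.compare_digest(sig, sign(path, expires))
```

The compute tier only issues the URL; the content itself never passes through it, which is where the reduced latency and compute load come from.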
26
BALANCING LOAD Where is the bottleneck?
27
USER SESSION
- Session is not affinitized (Load Balancer)
- Session in Windows Azure
  - Session Providers: SQL Azure, Table Storage, Windows Azure AppFabric Caching
  - JavaScript on the client
  - ViewState (hidden fields)
28
WINDOWS AZURE APPFABRIC CACHING
- Out-of-the-box ASP.NET providers for session state & page output caching
- Extremely low latency with the local cache
- The local cache lets you use spare memory in your Web tier, while the Caching tier gives you a predictable distributed cache
29
WINDOWS AZURE APPFABRIC CACHING
- Caches any managed object (CLR objects, rows, XML, binary data…)
- The only requirement is that the object is serializable
- Easily integrates into existing applications
- Same managed interfaces as Windows Server AppFabric Caching
- Secured by the Access Control Service
30
KEY CACHING PATTERNS
- Reference Data
  - A version of the authoritative data, refreshed periodically
  - Large number of accesses, mostly read
  - Example: product catalogs
- Activity-oriented Data
  - Data generated as part of the app activity, typically logged back to a backend datastore
  - Needs read and write access
  - Example: shopping cart, session state
- Resource-oriented Data
  - Authoritative data, modified by transactions, temporal in nature
  - Needs frequent read access, limited write access
  - Example: flight inventory, stock quotes
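For the reference-data pattern, a minimal sketch is a read-through cache whose entries expire after a fixed time, so the copy is refreshed periodically from the authoritative store. The `ReferenceCache` class and its parameters are hypothetical and use an in-process dictionary; they are not the AppFabric Caching API.

```python
import time


class ReferenceCache:
    """Read-through cache: serve the cached copy until its TTL runs out."""

    def __init__(self, load, ttl_seconds, clock=time.monotonic):
        self._load = load          # function that reads the authoritative store
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]        # still fresh: no backend call
        value = self._load(key)    # missing or stale: reload and re-cache
        self._entries[key] = (now + self._ttl, value)
        return value
```

Activity-oriented and resource-oriented data need write paths and, for the latter, some form of concurrency control, which this read-mostly sketch leaves out.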
31
PARTITION THE APPLICATION
- Multiple web sites
- Choose the right number of instances and instance size
- Monitor and scale your application without redeploying
- Use async processing (Worker Roles)
32
FUNDAMENTAL DESIGN PATTERN
33
DELAYED PROCESSING
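A rough sketch of delayed processing: the web tier accepts the request and puts a message on a queue, and a worker picks it up later. Here an in-process `queue.Queue` and a thread stand in for a Windows Azure queue and a Worker Role; the function names and the "processing" step are hypothetical.

```python
import queue
import threading

work_queue = queue.Queue()   # stand-in for a durable storage queue
results = []                 # stand-in for the backend store


def web_role_handle_request(survey_answer: str) -> str:
    """Web tier: enqueue the work and return immediately."""
    work_queue.put(survey_answer)
    return "accepted"


def worker_role_loop() -> None:
    """Worker tier: drain the queue until a None sentinel arrives."""
    while True:
        item = work_queue.get()
        if item is None:
            work_queue.task_done()
            break
        results.append(item.upper())  # placeholder for the real processing
        work_queue.task_done()
```

The web tier's response time no longer depends on how long the processing takes, and the two tiers can be scaled independently.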
34
CALCULATING SURVEY RESULTS
Two approaches:
1. Retrieve all the surveys to date at a fixed time interval, recalculate, and then save the summary data over the existing data
2. Retrieve the survey data since the last time the task ran and update the summary results
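The two approaches can be compared with a simple tally of answers. This is a sketch under the assumption that the summary is a count per answer; the function names are hypothetical.

```python
from collections import Counter


def recalculate_all(all_answers):
    """Approach 1: rebuild the summary from every answer collected so far."""
    return Counter(all_answers)


def update_incrementally(summary, new_answers):
    """Approach 2: fold only the answers since the last run into the summary."""
    summary.update(new_answers)
    return summary
```

Both produce the same totals. The first is simpler and self-correcting but reads more data on every run; the second reads less but has to track reliably which answers it has already counted.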
35
CALCULATING SURVEY RESULTS
36
MAP REDUCE ALGORITHM
- Original concepts come from the map and reduce functions used in functional languages (Haskell, F#, Erlang)
- Parallelizes operations on a large dataset and speeds up processing by using multiple compute nodes
- Dryad is Microsoft’s implementation
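A minimal sketch of the idea, with a thread pool standing in for multiple compute nodes: each chunk of answers is counted independently (map), and the partial counts are then merged (reduce). This is illustrative Python, not Dryad, and the function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_chunk(chunk):
    """Map step: count the answers in one chunk, independently of the others."""
    return Counter(chunk)


def merge(left, right):
    """Reduce step: combine two partial counts."""
    return left + right


def map_reduce(chunks):
    with ThreadPoolExecutor() as pool:   # threads stand in for compute nodes
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge, partials, Counter())
```

The approach works because the merge is associative: partial results can be combined in any grouping and still give the same total.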
37
DATA STORAGE
38
TAILSPIN SURVEYS DATA MODEL
39
SQL AZURE
- Partition (or shard) your data across databases
- Spreads load across multiple database instances
- Avoids hitting database size limits
- Parallelized queries across more nodes
- Improved query performance on commodity hardware
- Partitioning scheme varies per data set
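One possible partitioning scheme, sketched here as an assumption rather than the one TailSpin uses, is to hash the tenant name to pick a database. The shard names are hypothetical.

```python
import hashlib

SHARDS = ["surveys-db-0", "surveys-db-1", "surveys-db-2"]  # hypothetical database names


def shard_for(tenant: str) -> str:
    """Map a tenant to one database using a stable hash of its name."""
    digest = hashlib.sha256(tenant.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]
```

A stable hash (rather than Python's built-in `hash`, which varies between processes) keeps a tenant on the same database across restarts. Changing the number of shards remaps tenants, so a lookup table from tenant to database is a common alternative when tenants must be moved individually.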
40
SQL AZURE
Diagram labels: Hosted Compute; Tenant 1, Tenant 2, Tenant 3
41
TABLE STORAGE
- Don’t be afraid to de-normalize data
- Only two indexes in a table: Partition Key and Row Key
- They are not really tables; think of them as entity bags (key / value storage)
42
PAGING WITH TABLE STORAGE
- Use the ContinuationToken along with the Take operation in your query
- The ContinuationToken only accesses the next page of data
- To implement forward and back, you will need a stack of ContinuationTokens
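The token stack can be sketched as follows. `query_page` is a hypothetical stand-in for a table query that takes a continuation token and a page size and returns the page together with the token for the next page; the rows and the `Pager` class are illustrative only.

```python
ROWS = [f"survey-{i:02d}" for i in range(7)]   # stand-in for table entities


def query_page(token, take):
    """Fake table query: return one page and the token for the next one."""
    start = token or 0
    page = ROWS[start:start + take]
    next_token = start + take if start + take < len(ROWS) else None
    return page, next_token


class Pager:
    """Forward/back paging with a stack of the tokens used so far."""

    def __init__(self, take):
        self.take = take
        self._history = [None]   # token of each page shown; current page on top
        self._next = None        # token for the page after the current one

    def first(self):
        self._history = [None]
        page, self._next = query_page(None, self.take)
        return page

    def forward(self):
        if self._next is None:
            return []            # already on the last page
        self._history.append(self._next)
        page, self._next = query_page(self._history[-1], self.take)
        return page

    def back(self):
        if len(self._history) > 1:
            self._history.pop()  # drop the current page's token
        page, self._next = query_page(self._history[-1], self.take)
        return page
```

Going back re-runs the query with the token that produced the previous page, since the service itself only ever hands out a token for the next page.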
43
PAGING WITH TABLE STORAGE
44
TABLE STORAGE BEST PRACTICES
- Limit large scans and expect continuation tokens for queries that scan
- Entity Group Transactions: batch to reduce costs and get transaction semantics
- Do not reuse DataServiceContext across multiple logical operations
- Discard DataServiceContext on failures
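Batching for entity group transactions can be planned roughly as below. The sketch assumes, beyond what the slide states, that all entities in one batch must share a partition key and that a batch holds at most 100 entities; check the current service documentation for the exact limits. The function and constant names are hypothetical.

```python
from collections import defaultdict

MAX_BATCH = 100  # assumed per-batch entity limit, not stated on the slide


def plan_batches(entities):
    """Group entities by PartitionKey, then split each group into batches."""
    by_partition = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append(entity)

    batches = []
    for rows in by_partition.values():
        for start in range(0, len(rows), MAX_BATCH):
            batches.append(rows[start:start + MAX_BATCH])
    return batches
```

Each resulting batch would be sent as one request, which is where the cost saving and the all-or-nothing semantics come from.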
45
TABLE STORAGE BEST PRACTICES
- AddObject/AttachTo can throw an exception if the entity is already being tracked
- A query throws an exception if the resource does not exist; set IgnoreResourceNotFoundException on the DataServiceContext to suppress it
46
BLOB STORAGE
Blobs can be anything:
- Pictures, docs, etc.
- HTML
- XML
- JSON objects
47
BLOB STORAGE
49
PAGING WITH BLOB STORAGE
- Each item (survey answer) is stored as a blob (JSON) in a container
- A blob is used to maintain a list of the items (survey answers), by id, in the order they were entered
- Use an inverted tick count to generate the id of the answer to make it unique and ordered
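An inverted tick count subtracts the current tick count from the largest possible one, so newer items get smaller ids and sort first. The sketch below reproduces the arithmetic in Python, assuming .NET-style ticks (100-nanosecond units since 0001-01-01) and zero-padding so that string order matches numeric order; the helper names are hypothetical.

```python
from datetime import datetime, timezone

DOTNET_MAX_TICKS = 3_155_378_975_999_999_999   # DateTime.MaxValue.Ticks in .NET
_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)


def ticks(moment: datetime) -> int:
    """Number of 100-nanosecond intervals since 0001-01-01 UTC."""
    delta = moment - _EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def inverted_tick_id(moment: datetime) -> str:
    """Zero-padded id that sorts newest first when compared as a string."""
    return f"{DOTNET_MAX_TICKS - ticks(moment):019d}"
```

Two answers recorded with the same timestamp would receive the same id, so a real implementation may need an extra suffix to guarantee uniqueness.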
50
BLOB STORAGE BEST PRACTICES
- Use the parallel block upload count to reduce latency when uploading a blob
- The Client Library uses a default timeout of 90s; use a size-based timeout instead
- Snapshots: for block or page reuse, issue block and page uploads in place of the UploadXXX methods in the Storage Client
51
BLOB STORAGE BEST PRACTICES
- Shared Access Signature
  - Use a container-level policy, as it allows revoking permissions
  - Share SAS URLs using HTTPS
- Create new containers for blobs, such as log files, that have a retention period
  - To delete logs after 1 month, create a new container every month
- Container recreation
  - Garbage collection can take time, and until it finishes a container with the same name cannot be created
  - Use unique names for containers
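The monthly-container advice can be sketched as a naming helper plus a function that picks the containers that are past retention. The `logs-YYYY-MM` naming scheme and both function names are assumptions for illustration. Putting the month in the name also means a deleted container's name is never reused.

```python
from datetime import date


def log_container_name(day: date, prefix: str = "logs") -> str:
    """One container per month, e.g. logs-2011-05."""
    return f"{prefix}-{day:%Y-%m}"


def expired_containers(names, today: date, keep_months: int = 1, prefix: str = "logs"):
    """Return containers older than the retention window, ready to delete whole."""
    cutoff = today.year * 12 + (today.month - 1) - keep_months
    expired = []
    for name in names:
        year, month = map(int, name[len(prefix) + 1:].split("-"))
        if year * 12 + (month - 1) < cutoff:
            expired.append(name)
    return expired
```

Deleting one container is a single operation, whereas deleting a month of individual log blobs would take one call per blob.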
52
RESOURCES
- Books: http://wag.codeplex.com
- Products: http://www.microsoft.com/windowsazure, http://research.microsoft.com/en-us/projects/Dryad/
- Me: scottden@microsoft.com, @scottdensmore, http://scottdensmore.typepad.com
53
QUESTIONS?
After the session, please fill out the questionnaire. Questionnaires will be sent to you by e-mail and will be available in the profile section of the NT Conference website, www.ntk.si. Thank you!