
Architecting Applications for High Scalability

Architecting Applications for High Scalability: Leveraging the Windows Azure Platform. Scott Densmore, Sr. Software Development Engineer, Microsoft patterns & practices. About you (an assumption): you are a developer, know C#, and have a basic understanding of Windows Azure.





Presentation Transcript


  1. Architecting Applications for High Scalability: Leveraging the Windows Azure Platform • Scott Densmore, Sr. Software Development Engineer, Microsoft patterns & practices

  2. About you (an assumption) • You… • are a developer • know C# • have a basic understanding of Windows Azure

  3. Goals for this session • Learn what is available in Windows Azure to help you build scalable systems • (Re)-discover helpful design patterns • Learn about practical techniques • Identify (and avoid) potential problems

  4. tailspin

  5. DEMO TailSpin Surveys

  6. Take the survey http://tailspindemo.cloudapp.net/survey/fabrikam/slovenia

  7. Where should my application live? Location

  8. Geo-location

  9. Windows Azure Traffic Manager

  10. Windows Azure Traffic Manager 50ms

  11. Windows Azure Traffic Manager 100ms 50ms

  12. Windows Azure Traffic Manager 200ms 100ms 50ms

  13. Windows Azure Traffic Manager

  14. Windows Azure Traffic Manager • Fault Tolerance: redirect traffic to another deployment based on availability • Round Robin: traffic routed to deployments based on a fixed ratio • Performance: directs the user to the best / closest deployment • Load balancing across multiple Hosted Services • Integrated in the Windows Azure Platform portal
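
The three policies on this slide can be sketched as plain selection functions. This is only an illustration of the selection logic, not how Traffic Manager is implemented (it works at the DNS level); the `Deployment` type and region names are hypothetical, and the latencies reuse the 50/100/200 ms figures from the earlier slides.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List, Optional


@dataclass
class Deployment:
    name: str          # hypothetical region label
    latency_ms: int    # measured latency from the user's point of view
    healthy: bool = True


def pick_performance(deployments: List[Deployment]) -> Optional[Deployment]:
    """Performance policy: the healthy deployment with the lowest latency."""
    healthy = [d for d in deployments if d.healthy]
    return min(healthy, key=lambda d: d.latency_ms) if healthy else None


def pick_failover(deployments: List[Deployment]) -> Optional[Deployment]:
    """Fault tolerance policy: the first healthy deployment in priority order."""
    return next((d for d in deployments if d.healthy), None)


def round_robin(deployments: List[Deployment]) -> Iterator[Deployment]:
    """Round robin policy: hand out deployments in turn (equal ratio here)."""
    return cycle(deployments)


deployments = [
    Deployment("region-a", 50),
    Deployment("region-b", 100),
    Deployment("region-c", 200),
]

closest = pick_performance(deployments)

# Simulate an outage of the closest deployment.
deployments[0].healthy = False
closest_after_outage = pick_performance(deployments)
failover_target = pick_failover(deployments)

rr = round_robin(deployments)
first_three = [next(rr).name for _ in range(3)]
```

In this sketch round robin ignores health and uses an equal ratio; a weighted ratio would repeat entries in the cycle.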

  15. Windows Azure Traffic Manager • Multiple factors determine DNS resolution • Configured by Microsoft • Geo-IP mapping • Periodic performance measurement • Configured by service owner • Policy: Performance, Failover, Geo, Ratio • Monitoring • Currently in CTP

  16. Windows Azure CDN • Integrated with Storage • Delivery from Windows Azure Compute instances • HTTPS support • CTP of Smooth Streaming

  17. Leveraging the CDN

  18. Leveraging the CDN

  19. Managing CDN Content Expiration • Default behavior is to fetch once and cache for up to 72 hrs • Modify cache control blob header to control the TTL • x-ms-blob-cache-control: public, max-age=<value in seconds> • Think hours, days or weeks • Higher numbers reduce cost and latency via CDN & downstream caches
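
As a rough illustration of the header above, the sketch below builds the `x-ms-blob-cache-control` value from a TTL expressed in days or hours. The helper name `cache_control_header` is hypothetical; only the header name and the `public, max-age=<seconds>` format come from the slide.

```python
from datetime import timedelta
from typing import Dict


def cache_control_header(ttl: timedelta) -> Dict[str, str]:
    """Return the blob cache-control header for the given time-to-live."""
    seconds = int(ttl.total_seconds())
    if seconds <= 0:
        raise ValueError("TTL must be positive")
    return {"x-ms-blob-cache-control": f"public, max-age={seconds}"}


# "Think hours, days or weeks": one week expressed in seconds.
headers = cache_control_header(timedelta(days=7))
```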

  20. Managing CDN Content Expiration • Use versioned URLs to expire content on-demand • Enables easy rollback and A/B testing • Diagram: HTML served by the app references <img src="http://azXXXX.vo.msecnd.net/images/logo.2011-05-29.png"/>; the CDN and Blob Storage each hold logo.2011-05-01.png and logo.2011-05-29.png
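
A minimal sketch of the versioned-URL idea, assuming the date-stamped naming shown on the slide. `CDN_BASE` reuses the slide's placeholder host, and `versioned_url` is a hypothetical helper.

```python
from datetime import date

# Placeholder endpoint copied from the slide; a real CDN host would differ.
CDN_BASE = "http://azXXXX.vo.msecnd.net/images"


def versioned_url(name: str, ext: str, version: date) -> str:
    """Embed the version date in the file name so each release has its own URL."""
    return f"{CDN_BASE}/{name}.{version.isoformat()}.{ext}"


current = versioned_url("logo", "png", date(2011, 5, 29))
previous = versioned_url("logo", "png", date(2011, 5, 1))  # kept for rollback
```

Because the old blob stays in storage under its own URL, rolling back means pointing the HTML at `previous` again, with no need to wait for cached copies to expire.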

  21. Who is using my application? Identity

  22. Identity

  23. Identity

  24. Shared access signatures • Provide direct access to content • Can be time-bound or revoked on demand • Also works for write access (e.g. user-generated content)

  25. Shared access signatures • 1. “I am Bob & I want X” • 2. Service prepares a Shared Access Signature (SAS) to X using the securely stored storage account key • 3. Service returns SAS (signed HTTPS URL) • 4. Bob uses SAS to access X directly from Blob Storage for reduced latency & compute load • Diagram labels: Hosted Compute, Key, X = non-public blob (e.g. paid or ad-funded content)
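
The four steps can be sketched with a generic HMAC-signed, time-bound URL. This is not the actual Windows Azure SAS format or its query parameters; the key, host, parameter names (`expiry`, `sig`) and helper functions are all assumptions made for illustration. `verify_signed_url` stands in for the check that the storage service would perform.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

ACCOUNT_KEY = b"demo-key-not-a-real-storage-key"  # hypothetical secret


def sign(resource: str, expiry: int, key: bytes) -> str:
    """HMAC-SHA256 over the resource path and the expiry time."""
    message = f"{resource}\n{expiry}".encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def make_signed_url(base_url: str, resource: str, ttl_seconds: int,
                    key: bytes, now: Optional[int] = None) -> str:
    """Steps 2 and 3: the service signs a time-bound URL and returns it."""
    now = int(time.time()) if now is None else now
    expiry = now + ttl_seconds
    query = urlencode({"expiry": expiry, "sig": sign(resource, expiry, key)})
    return f"{base_url}{resource}?{query}"


def verify_signed_url(url: str, key: bytes, now: Optional[int] = None) -> bool:
    """Step 4, storage side: accept only unexpired, untampered URLs."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expiry = int(params["expiry"][0])
        signature = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expiry:
        return False
    return hmac.compare_digest(signature, sign(parsed.path, expiry, key))


url = make_signed_url("https://example.invalid", "/content/x.mp4",
                      ttl_seconds=60, key=ACCOUNT_KEY, now=1000)
valid_in_window = verify_signed_url(url, ACCOUNT_KEY, now=1030)
valid_after_expiry = verify_signed_url(url, ACCOUNT_KEY, now=1061)
valid_if_tampered = verify_signed_url(url.replace("x.mp4", "y.mp4"),
                                      ACCOUNT_KEY, now=1030)
```

The account key never leaves the service; the client only ever sees the signed URL.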

  26. Where is the bottleneck? Balancing load

  27. User session • Session is not affinitized – Load Balancer • Session in Windows Azure • Session Providers • SQL Azure • Table Storage • Windows Azure AppFabric Caching • JavaScript on the client • ViewState (hidden fields)

  28. Windows AppFabric Caching • Out-of-the-box ASP.NET providers for session state & page output caching • Extremely low latency with the local cache • The local cache lets you use spare memory in your Web tier, while the Caching tier gives you a predictable distributed cache

  29. Windows AppFabric Caching • Caches any managed object (CLR objects, rows, XML, Binary Data…) • Only requirement is that the object should be serializable • Easily integrates into existing applications • Same managed interfaces as Windows Server AppFabric Caching • Secured by the Access Control Service

  30. Key Caching Patterns • Reference Data • A version of the authoritative data, refreshed periodically • Large number of accesses, mostly read • Example – Product catalogs • Activity-oriented Data • Data generated as part of the app activity, typically logged back to a backend datastore • Needs read, write access • Example – Shopping cart, Session State • Resource-oriented Data • Authoritative data, modified by transactions, temporal in nature • Needs frequent read, limited write access • Example – Flight Inventory, Stock Quotes
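
The reference-data pattern (a copy of the authoritative data, refreshed periodically, mostly read) might look like the in-process sketch below. It does not use the AppFabric Caching API; `ReferenceDataCache`, the loader, and the fake clock are hypothetical stand-ins.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ReferenceDataCache:
    """Serve cached values and reload them from the source once they are stale."""

    def __init__(self, loader: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]                 # fresh enough: no backend call
        value = self._loader(key)           # miss or stale: hit the backend
        self._entries[key] = (now, value)
        return value


# Demo with a fake clock and a loader that counts backend calls.
fake_now = [0.0]
backend_calls = []


def load_catalog(key: str) -> str:
    backend_calls.append(key)
    return f"catalog-for-{key}-v{len(backend_calls)}"


cache = ReferenceDataCache(load_catalog, ttl_seconds=300,
                           clock=lambda: fake_now[0])
first = cache.get("fabrikam")
second = cache.get("fabrikam")   # served from the cache
fake_now[0] = 301.0              # TTL elapsed
third = cache.get("fabrikam")    # refreshed from the backend
```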

  31. Partition the application • Multiple web sites • Choose the right number of instances and instance size • Monitor and scale your application without redeploying • Use async processing (Worker Roles)

  32. Fundamental design pattern

  33. Delayed processing

  34. Calculating survey results • Two approaches • Retrieve all the surveys to date at a fixed time interval, recalculate and then save the summary data over the existing data • Retrieve the survey data since the last time the task ran and update the summary results
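
The two approaches can be compared with a small sketch. The answer shape (a sequence number plus the chosen option) and the `Summary` type are hypothetical; the point is that the incremental update reads only answers newer than the last run yet ends with the same totals as a full recalculation.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple

Answer = Tuple[int, str]  # (sequence number, chosen option); assumed shape


def full_recalculate(answers: Iterable[Answer]) -> Counter:
    """Approach 1: recount every answer and overwrite the old summary."""
    return Counter(option for _, option in answers)


@dataclass
class Summary:
    counts: Counter = field(default_factory=Counter)
    last_seen: int = -1  # highest sequence number already merged


def incremental_update(summary: Summary, answers: Iterable[Answer]) -> Summary:
    """Approach 2: merge only the answers that arrived since the last run."""
    new: List[Answer] = [(s, o) for s, o in answers if s > summary.last_seen]
    for _, option in new:
        summary.counts[option] += 1
    if new:
        summary.last_seen = max(s for s, _ in new)
    return summary


all_answers = [(0, "yes"), (1, "no"), (2, "yes"), (3, "yes")]

summary = Summary()
incremental_update(summary, all_answers[:2])  # first run of the task
incremental_update(summary, all_answers)      # later run sees everything
recalculated = full_recalculate(all_answers)
```

The incremental version does less work per run but has to persist `last_seen` reliably; the full recalculation is simpler and self-correcting at the cost of reading all the data each time.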

  35. Calculating survey results

  36. Map reduce algorithm • Original concepts come from the map and reduce functions used in functional languages (Haskell, F#, Erlang) • Parallelizes operations on a large dataset and speeds up processing by using multiple compute nodes • Dryad is Microsoft’s implementation
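
A toy version of the idea, assuming survey answers are already split into partitions: each partition is mapped to partial counts in parallel, then the partial results are reduced into one total. Threads stand in for compute nodes here, and the data is made up.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import List


def map_partition(answers: List[str]) -> Counter:
    """Map step: count the answers within one partition."""
    return Counter(answers)


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial counts."""
    return left + right


partitions = [["yes", "no", "yes"], ["no", "no"], ["yes"]]

with ThreadPoolExecutor() as pool:
    partials = list(pool.map(map_partition, partitions))

totals = reduce(reduce_counts, partials, Counter())
```

The reduce function is associative, so partial results can be merged in any grouping, which is what allows the work to be spread across nodes.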

  37. Data storage

  38. TailSpin Surveys Data Model

  39. SQL Azure • Partition (or shard) your data across databases • Spreads load across multiple database instances • Avoid hitting database size limits • Parallelized queries across more nodes • Improved query performance on commodity hardware • Partitioning scheme varies per data set
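
One possible partitioning scheme is to hash a tenant identifier onto a fixed list of databases. The database names and the `shard_for` helper below are hypothetical, and a stable hash (not Python's randomized built-in `hash`) is used so that every instance routes a tenant to the same database.

```python
import hashlib
from typing import List

# Hypothetical database names; a real deployment would use connection strings.
SHARDS = ["surveys-db-0", "surveys-db-1", "surveys-db-2"]


def shard_for(tenant: str, shards: List[str] = SHARDS) -> str:
    """Map a tenant to a database with a hash that is stable across processes."""
    digest = hashlib.sha256(tenant.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


fabrikam_db = shard_for("fabrikam")
adatum_db = shard_for("adatum")
```

Modulo hashing moves most tenants when the number of shards changes; a lookup table from tenant to database is a common alternative when tenants need to be placed or moved individually.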

  40. SQL Azure • Diagram labels: Hosted Compute, Tenant 1, Tenant 2, Tenant 3

  41. Table storage • Don’t be afraid to de-normalize data • Only two indexes in a table • Partition Key • Row Key • They are not really tables, think of them as Entity bags (key / value storage)

  42. Paging with table storage • Use the ContinuationToken along with the Take operation in your query • The ContinuationToken only accesses the next page of data • To implement forward and back you will need a stack of ContinuationTokens
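
The stack-of-tokens technique can be sketched against a simulated query. `query_page` is a stand-in for a table query with Take and a continuation token (here the token is just an offset); `Pager` is a hypothetical helper that remembers the token that opened each earlier page so it can go back.

```python
from typing import List, Optional, Tuple

ROWS = [f"survey-{i:02d}" for i in range(7)]  # stand-in for table entities


def query_page(token: Optional[int], take: int) -> Tuple[List[str], Optional[int]]:
    """Return one page plus the token for the next page (None on the last page)."""
    start = 0 if token is None else token
    page = ROWS[start:start + take]
    next_token = start + take if start + take < len(ROWS) else None
    return page, next_token


class Pager:
    def __init__(self, take: int) -> None:
        self.take = take
        self._history: List[Optional[int]] = []  # tokens of earlier pages
        self._current: Optional[int] = None      # token of the current page
        self._next: Optional[int] = None
        self.page: List[str] = []
        self._load(None)

    def _load(self, token: Optional[int]) -> None:
        self.page, self._next = query_page(token, self.take)
        self._current = token

    def forward(self) -> bool:
        if self._next is None:
            return False
        self._history.append(self._current)  # push so we can come back
        self._load(self._next)
        return True

    def back(self) -> bool:
        if not self._history:
            return False
        self._load(self._history.pop())      # pop the previous page's token
        return True


pager = Pager(take=3)
page1 = pager.page
pager.forward()
page2 = pager.page
pager.forward()
page3 = pager.page
at_end = not pager.forward()
pager.back()
page_after_back = pager.page
```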

  43. Paging with table storage

  44. Table storage best practices • Limit large scans and expect continuation tokens for queries that scan • Entity Group Transaction: batch to reduce costs and get transaction semantics • Do not reuse DataServiceContext across multiple logical operations • Discard DataServiceContext on failures
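
Batching for Entity Group Transactions can be planned with a small helper. The sketch relies on two service constraints: all entities in one batch share a PartitionKey, and a batch holds at most 100 entities. `plan_batches` and the dictionary entity shape are hypothetical, and the payload size limit is not modelled.

```python
from collections import defaultdict
from typing import Dict, List

MAX_BATCH = 100  # entity limit per Entity Group Transaction

Entity = Dict[str, str]


def plan_batches(entities: List[Entity], max_batch: int = MAX_BATCH) -> List[List[Entity]]:
    """Group entities by PartitionKey, then split each group into batches."""
    by_partition: Dict[str, List[Entity]] = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append(entity)

    batches: List[List[Entity]] = []
    for group in by_partition.values():
        for start in range(0, len(group), max_batch):
            batches.append(group[start:start + max_batch])
    return batches


entities = (
    [{"PartitionKey": "a", "RowKey": f"{i:04d}"} for i in range(250)]
    + [{"PartitionKey": "b", "RowKey": f"{i:04d}"} for i in range(3)]
)
batch_sizes = [len(batch) for batch in plan_batches(entities)]
```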

  45. Table storage best practices • AddObject/AttachTo can throw an exception if the entity is already being tracked • A query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException

  46. Blob storage • Blobs can be anything • Pictures, docs, etc. • HTML • XML • JSON objects

  47. Blob storage

  48. Blob storage

  49. Paging with blob storage • Each item (survey answer) is stored as a blob (JSON) in a container • A blob is used to maintain a list of the items (survey answers) as they were entered, by id • Use an inverted tick count to generate the id of the answer to make it unique and ordered
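
A sketch of an inverted tick id, assuming the common .NET idiom of subtracting the current tick count from `DateTime.MaxValue.Ticks` and zero-padding to 19 digits. The helper names are hypothetical. With this scheme a plain lexicographic listing returns the newest answers first.

```python
from datetime import datetime

MAX_TICKS = 3_155_378_975_999_999_999  # value of .NET DateTime.MaxValue.Ticks
EPOCH = datetime(1, 1, 1)              # .NET ticks count from 0001-01-01


def to_ticks(moment: datetime) -> int:
    """Convert a datetime to 100-nanosecond ticks using integer arithmetic."""
    delta = moment - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def inverted_tick_id(moment: datetime) -> str:
    """Later moments give smaller numbers; padding keeps string order numeric."""
    return f"{MAX_TICKS - to_ticks(moment):019d}"


older = inverted_tick_id(datetime(2011, 5, 1, 12, 0, 0))
newer = inverted_tick_id(datetime(2011, 5, 29, 12, 0, 0))
listing = sorted([older, newer])  # lexicographic, like a blob listing
```

Ticks alone may collide when several answers arrive within the same tick, so a suffix (for example a GUID) is sometimes appended; that detail is an assumption and not something the slide states.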

  50. Blob storage best practices • Use a parallel block upload count to reduce latency when uploading a blob • The Client Library uses a default timeout of 90s; use a size-based timeout • Snapshots: for block or page reuse, issue block and page uploads in place of the UploadXXX methods in the Storage Client
