
Windows Azure Storage: How It Works, Best Practices, and Future Directions



Presentation Transcript


  1. AZR306 Windows Azure Storage: How It Works, Best Practices, and Future Directions Jai Haridas Development Manager Microsoft Corporation

  2. Agenda • Windows Azure Storage – A Quick Introduction • What is new? • Key Concepts • Best Practices • Future direction

  3. Windows Azure Storage Introduction

  4. Windows Azure Storage • Cloud Storage - Anywhere and anytime access • Highly Durable, Available and Massively Scalable • Easily build “internet scale” applications. • Over 300PB of raw storage today and growing! • Bing Ingestion Engine – More than 40K requests/second at peak • Pay for what you use • Exposed via easy and open REST APIs • Client libraries in .NET, Java, Node.js etc.

  5. Abstractions • Blobs – Simple interface to store and retrieve files in cloud • Enterprise document share; Social networking sites – share pictures, video etc. • Big Data Analysis – Store raw data and benefit from large compute available in cloud • Backups – device, computer backups • Tables – Extremely easy to use NoSQL system that auto scales • Structured data at massive scale • User registration and content metadata • Key-value lookups at scale • Queues – Scalable, reliable and persistent messaging system • Decouple components – web role to worker role communication • Building process flows – order processing system • Disks and Drives – Network mounted durable drives • Fixed formatted VHDs stored as page blobs

  6. Windows Azure Blobs • Endpoint: http://<account>.blob.core.windows.net/ • Hierarchy: Account → Container → Blobs • Example: account contoso with containers images (PIC01.JPG, PIC02.JPG) and videos (VID1.AVI)

  7. Windows Azure Blobs • Create, Delete, SetAcl Containers • Put/Get Blobs/Set Metadata • Parallel uploads & Single range gets • Copy Blob • Asynchronous copy – copy between accounts • Snapshots • Read only version of a blob • Promote a version as base blob • Sharing Scenarios • Private access, public access or Shared Access Signatures (Signed Url) • Lease on a container or blob • Useful for master election scenarios • Infinite leases - Locks

  8. Windows Azure Tables • Endpoint: http://<account>.table.core.windows.net/ • Hierarchy: Account → Table → Entities • Example: account contoso with tables Blogs and Videos • Sample entities: (PartitionKey=‘uid1’, RowKey=‘B:002’, Rating=‘2’); (PartitionKey=‘uid1’, RowKey=‘C:002:003’, Committed=‘1’); (PartitionKey=‘uid2’, RowKey=‘pic.wmv’, Rating=‘1’)

  9. Windows Azure Tables • NoSQL Schema-less structured storage • OData Protocol – WCF data service client library for .NET • Create, Delete, SetAcl Tables • Insert/Update/Upsert/Delete Entities • Query • Single Entity Lookups • Range queries via filter • Sharing Scenarios • Private access or Shared Access Signatures (Signed Url)

  10. Windows Azure Queues • Endpoint: http://<account>.queue.core.windows.net/ • Hierarchy: Account → Queue → Messages • Example: account contoso with queues orders and imageprocessing • Sample messages: (CustomerId=41, OrderId=O21); (CustomerId=12, OrderId=O1); (BlobUrl=http://contoso.blob…)

  11. Windows Azure Queues • Create, Delete, SetAcl Queues • Enqueue Messages – 64KB messages • Lease mechanism for processing • Get Message(s) – provide lease time • Delete Message • Metadata • Message count – Enables auto scaling • Dequeue count – Detect poison messages • Sharing Scenarios • Private access or Shared Access Signatures (Signed Url)

  12. Windows Azure Disks & Drives • Move legacy applications easily to cloud • IaaS • Disks - OS and multiple data disks associated with VM. Network mounted durable drives • PaaS • Drives – Dynamically mount network attached single volume VHDs • Drives & Disks • Fixed formatted VHD • Stored using page blobs • All flushed and un-buffered writes are made durable • Transactions on blobs like PutPage, GetBlob (ranges) etc. are counted towards storage account

  13. announcing Awesomeness

  14. Price Reduction • 90% price reduction for transactions!!!! • 1 cent – 100K Transactions • If you poll every second for a queue message • 0.008$ for 1 day • Tiered Pricing for storage • Find more at http://www.windowsazure.com/en-us/pricing/details/
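The polling figure on this slide can be checked with a little arithmetic. The sketch below assumes each poll counts as one billable transaction and uses the slide's price of 1 cent per 100K transactions; the function name is illustrative only.

```python
# Assumption: one poll = one billable transaction, priced at $0.01 per 100,000.
PRICE_PER_100K_USD = 0.01
SECONDS_PER_DAY = 24 * 60 * 60


def daily_polling_cost(polls_per_second: float = 1.0) -> float:
    """Cost in USD of polling a queue at a fixed rate for one day."""
    transactions = polls_per_second * SECONDS_PER_DAY
    return transactions / 100_000 * PRICE_PER_100K_USD


cost = daily_polling_cost()
print(f"{cost:.5f} USD per day")
```

One poll per second is 86,400 transactions a day, which works out to roughly $0.0086, in line with the "0.008$ for 1 day" quoted on the slide.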

  15. Locally Redundant Storage and Geo Redundant Storage
      Locally Redundant Storage • Higher durability • 3 local replicas in primary location • Local replicas – synchronously replicated • Common failures (disk, node, rack) – use local copies to recover • Major disasters – contact customer about potential data loss • Reduced Price – 23-34% based on how much you store • Turn off Geo for your storage account in portal • Non-critical data that can be recreated on major disasters • Application manages its own replica • Companies have limitations on geo locations
      Geo Redundant Storage • Highest level of durability • 3 local replicas each in primary and secondary locations • Local replicas – synchronously replicated • Geo replica – asynchronously replicated • Common failures (disk, node, rack) – use local copies to recover • Major disasters – use geo replicated copy (400+ miles apart) • Price remains the same as before • Enabled by default

  16. New REST API Version – 2012-02-12 • Copy Blob - asynchronous • Copy blobs between storage accounts • Leases • Infinite leases – Lock • Leases on container • Shared Access Signatures (Signed Url) • One hour expiry time limit removed for SAS • SAS for table and queues • Mobile applications can request SAS

  17. demo Configure Analytics via Portal

  18. Windows Azure Storage Key Concepts

  19. Key Concepts – Indexes • Objects are partitioned for scale based on partitioning key • Blobs – Account Name, Container Name, Blob Name • Tables – Account Name, Table Name, PartitionKey • Queues – Account Name, Queue Name • Object Key exists for every object in addition to partitioning key • Blobs – Account Name, Container Name, Blob Name, Snapshot Time • Tables – Account Name, Table Name, PartitionKey, RowKey • Queues – Account Name, Queue Name, Message Visibility Time, Message Id

  20. Key Concepts – How are objects stored? • Objects are sorted by their key • Lexically sorted • 9 > 10 but 09 < 10 • Prefix in key is more important in determining locality • Example: Logs – YYYYMMDD-HHMM-ddd.log vs. ddd-YYYYMMDD-HHMM.log
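The lexical ordering on this slide can be reproduced with plain string sorting. This is a small sketch, not service code; the log names follow the two patterns from the slide, and the dates and node numbers are made up for illustration.

```python
# Plain string comparison: "9" sorts after "10", but "09" sorts before "10".
numbers = sorted(["9", "10", "09"])

# Hypothetical log names for three nodes (ddd) over two days.
days = ["20120611", "20120612"]
nodes = [1, 2, 3]

# Pattern 1: YYYYMMDD-HHMM-ddd.log (date is the prefix).
by_date = sorted(f"{day}-0900-{node:03d}.log" for day in days for node in nodes)

# Pattern 2: ddd-YYYYMMDD-HHMM.log (node id is the prefix).
by_node = sorted(f"{node:03d}-{day}-0900.log" for day in days for node in nodes)

print(numbers)
print(by_date)
print(by_node)
```

With the date as prefix, all of the newest logs land together at the end of the key range; with the node id as prefix, each node's logs sit together and new writes are spread across the range.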

  21. Key Concepts – How does storage scale? • Range based partitioning system • All objects with same partition key => Single partition • System automatically load balances based on traffic load • Assume partition key range is from 0000 - 9999 • (Diagram: the range 0000-9999 is split into 0000-0200 and 0201-9999, and 0000-0200 is split again into 0000-0090 and 0091-0200, with the resulting ranges spread across servers A, B and C)
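A range-partitioned map like the one on this slide can be sketched as a sorted list of range start keys. This is an illustration only: the `RangeMap` class is hypothetical, the assignment of ranges to servers A, B and C is an assumption (the diagram did not survive in the transcript), and keys are assumed to be four-digit strings.

```python
import bisect


class RangeMap:
    """Hypothetical partition map: sorted range start keys -> server name."""

    def __init__(self):
        # Initially one range, 0000-9999, served by A.
        self.starts = ["0000"]
        self.servers = ["A"]

    def lookup(self, key: str) -> str:
        # The owning range is the last one whose start key is <= key.
        i = bisect.bisect_right(self.starts, key) - 1
        return self.servers[i]

    def split(self, at_key: str, new_server: str) -> None:
        """Split the range containing at_key; keys >= at_key in it move."""
        if at_key in self.starts:
            raise ValueError("a range already starts at this key")
        i = bisect.bisect_right(self.starts, at_key)
        self.starts.insert(i, at_key)
        self.servers.insert(i, new_server)


partitions = RangeMap()
partitions.split("0201", "B")  # 0000-0200 stays on A, 0201-9999 moves to B
partitions.split("0091", "C")  # 0000-0090 stays on A, 0091-0200 moves to C
print(partitions.lookup("0050"), partitions.lookup("0150"), partitions.lookup("5000"))
```

The real service decides when and where to split based on traffic; the sketch only shows how a lookup stays a single binary search however many splits have happened.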

  22. Key Concepts – Scalability Targets • Multi-tenant system with isolation logic in place • Scalability targets for a single account and partition exist to control isolation • Account targets • 100 TB storage capacity per account • 5000 objects/sec per account • 3 Gbps per account • Partition Targets • 500 objects/sec per partition • 60 MB/s per partition • Query • Smaller scans are efficient and larger scans can result in more roundtrips • Smaller scans provide better consistency than larger scans
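The targets above give a quick lower bound on how many partitions and accounts a workload needs. The sketch uses the figures from this slide (which reflect the time of the talk) and assumes load is spread evenly; the function names are illustrative.

```python
import math

# Targets quoted on the slide (as of this presentation).
ACCOUNT_OPS_PER_SEC = 5000
PARTITION_OPS_PER_SEC = 500


def min_partitions(target_ops_per_sec: int) -> int:
    """Fewest partitions needed, assuming perfectly even load."""
    return math.ceil(target_ops_per_sec / PARTITION_OPS_PER_SEC)


def min_accounts(target_ops_per_sec: int) -> int:
    """Fewest storage accounts needed, assuming perfectly even load."""
    return math.ceil(target_ops_per_sec / ACCOUNT_OPS_PER_SEC)


print(min_partitions(2000), min_accounts(12000))
```

Real traffic is rarely even, so these numbers are a floor rather than a sizing recommendation.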

  23. Windows Azure Storage Best Practices

  24. Storage Accounts • Collocate storage accounts with your compute roles • Egress is free within same region • Use multiple storage accounts when • Scale targets exceed a single storage account • Client proximity – Presence of clients worldwide • Map multiple clients to same storage account • Use different containers/tables/queues instead for each customer • Design to add more accounts as needed • Use different account for Windows Azure Diagnostics • Choose locally redundant storage if your data is • Not critical and can be restored after major disasters • Subject to geographical boundary constraints on where it can be stored

  25. Common Design & Scalability • Common Settings • Turn off Nagling & Expect 100 (.NET – ServicePointManager) • Set connection limit (.NET – ServicePointManager.DefaultConnectionLimit) • Turn off Proxy detection when running in cloud (.NET – Config: autodetect setting in proxy element) • Distribute your requests across your logical range of partition values for scale • Avoid Append/Prepend pattern • Requests target objects in sorted order of keys • Perform one time operations at startup rather than every request • Creating containers/tables/queues which should always exist • Setting required constant ACLs on container/table/queue • Cache latency sensitive requests on objects that rarely change • Use appropriate retry policy for intermittent errors • Storage client uses exponential retry by default
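The idea behind an exponential retry policy can be sketched in a few lines. This is a generic illustration, not the storage client's own implementation: `TransientError`, `with_retries`, and the delay and jitter values are assumptions chosen for the example.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for an intermittent failure such as a timeout."""


def with_retries(operation, max_attempts=4, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep):
    """Call operation(), backing off exponentially between failed attempts."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little jitter keeps many clients from retrying in lockstep.
            sleep(delay * random.uniform(0.8, 1.2))


# Demo: an operation that fails twice, then succeeds. Delays are recorded
# instead of slept so the example finishes immediately.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("simulated timeout")
    return "ok"


delays = []
result = with_retries(flaky, sleep=delays.append)
print(result, len(delays))
```

Only errors known to be intermittent should be retried this way; retrying a request that fails for a permanent reason just adds latency.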

  26. Blob Design & Scalability • How to upload a single large blob as fast as possible? • Use parallel block upload and then commit • Storage client library – CloudBlobClient’s SingleBlobUploadThresholdInBytes and ParallelOperationThreadCount • How to upload multiple blobs as fast as possible? • Use single thread for each blob but upload multiple blobs in parallel • Migrate blobs between accounts using new asynchronous Copy Blob • List various containers in source account to avoid “Append only” • Use Windows Azure CDN to efficiently deliver content to users worldwide • Future: Accommodate better “Append/Prepend” blob writes
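The "parallel block upload and then commit" pattern can be sketched against an in-memory stand-in. `FakeBlobStore` and `upload_in_blocks` are hypothetical names, not a real client library; the stand-in only mimics the two steps of staging blocks and then committing an ordered block list.

```python
import base64
from concurrent.futures import ThreadPoolExecutor


class FakeBlobStore:
    """In-memory stand-in for the blob service (hypothetical)."""

    def __init__(self):
        self.staged = {}      # block id -> bytes, not yet part of the blob
        self.committed = b""  # blob content after the block list is committed

    def put_block(self, block_id, data):
        self.staged[block_id] = data

    def put_block_list(self, block_ids):
        # The commit step fixes the order of the blocks in the final blob.
        self.committed = b"".join(self.staged[b] for b in block_ids)


def upload_in_blocks(store, data, block_size, workers=4):
    """Stage fixed-size blocks in parallel, then commit them in order."""
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    # Equal-length, base64-encoded ids, one per block.
    ids = [base64.b64encode(f"{n:06d}".encode()).decode()
           for n in range(len(chunks))]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(store.put_block, ids, chunks))  # blocks may land in any order
    store.put_block_list(ids)
    return ids


payload = bytes(range(256)) * 40  # 10,240 bytes of sample data
store = FakeBlobStore()
block_ids = upload_in_blocks(store, payload, block_size=1024)
print(len(block_ids), store.committed == payload)
```

Because the commit names the blocks in order, the uploads themselves can finish in any order, which is what makes the parallelism safe.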

  27. Table Design & Scalability • Design Process • Critical queries: Select PartitionKey, RowKey to improve performance • Entities that need to be updated together: Same PartitionKey to allow batch • Schema-less: Store multiple types in same table • Concatenate columns to form keys • Be aware of entity locality • (PartitionKey, RowKey) determine sort order • Stored together to reduce IO and improve performance • Query Performance • Performance depends on how many entities the service needs to iterate, not on how many entities match the query filter • Order of performance • Single entity lookups are fastest – Great for latency sensitive scenarios • Small range queries – Look out for continuation tokens • Larger range queries – Good for latency insensitive background workers; Look out for continuation tokens • Do not reuse DataServiceContext/TableServiceContext across logical operations
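A range query that follows continuation tokens can be sketched with an in-memory table sorted by (PartitionKey, RowKey). `FakeTable` is a hypothetical stand-in; in particular, its token is a plain list index, whereas a real continuation token should be treated as opaque. The entity values are sample data.

```python
import bisect


class FakeTable:
    """In-memory stand-in: entities kept sorted by (PartitionKey, RowKey)."""

    def __init__(self, entities, page_size=2):
        self.rows = sorted(entities)  # (partition_key, row_key, value) tuples
        self.page_size = page_size

    def query(self, pk, rk_from, rk_to, token=None):
        """One page of rk_from <= RowKey < rk_to, plus a continuation token."""
        start = token if token is not None else bisect.bisect_left(self.rows, (pk, rk_from))
        end = bisect.bisect_left(self.rows, (pk, rk_to))
        stop = min(start + self.page_size, end)
        next_token = stop if stop < end else None
        return self.rows[start:stop], next_token


def query_all(table, pk, rk_from, rk_to):
    """Keep requesting pages until no continuation token is returned."""
    results, roundtrips, token = [], 0, None
    while True:
        page, token = table.query(pk, rk_from, rk_to, token)
        results.extend(page)
        roundtrips += 1
        if token is None:
            return results, roundtrips


table = FakeTable([
    ("uid1", "B:001", 5), ("uid1", "B:002", 2), ("uid1", "B:003", 4),
    ("uid1", "C:002:003", 1), ("uid2", "pic.wmv", 1),
])
# All RowKeys starting with "B:" (";" is the character right after ":").
blogs, roundtrips = query_all(table, "uid1", "B:", "B;")
print([rk for _, rk, _ in blogs], roundtrips)
```

The loop is the important part: a caller that reads only the first page silently misses results whenever a token comes back.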

  28. Queue Design & Scalability • Use visibility time as a lease on message with small lease time • Update visibility time based on processing needed • Make message processing idempotent • Batch Get – increase message processing throughput • Use “Message Count” to scale workers • Use “Dequeue Count” on message to handle poison messages • Use “Update Message” API to save intermediate processing state • Use blobs to store messages that exceed 64KB • Use multiple queues to scale beyond the published targets
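The visibility-time lease and the dequeue-count check can be sketched with an in-memory queue. `FakeQueue`, `process_one` and the threshold of three attempts are assumptions for illustration; the sketch also deletes by message id, which simplifies how a real queue identifies the message to delete.

```python
import itertools


class FakeQueue:
    """In-memory stand-in for a queue with visibility timeouts (hypothetical)."""

    def __init__(self):
        self._messages = []
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages.append({"id": next(self._ids), "body": body,
                               "visible_at": 0.0, "dequeue_count": 0})

    def get(self, now, visibility_timeout):
        """Lease the first visible message; it stays hidden until the lease ends."""
        for m in self._messages:
            if m["visible_at"] <= now:
                m["visible_at"] = now + visibility_timeout
                m["dequeue_count"] += 1
                return m
        return None

    def delete(self, message_id):
        self._messages = [m for m in self._messages if m["id"] != message_id]


MAX_DEQUEUE_COUNT = 3  # assumed threshold for treating a message as poison


def process_one(queue, now, handler, poison):
    msg = queue.get(now, visibility_timeout=30.0)
    if msg is None:
        return False
    if msg["dequeue_count"] > MAX_DEQUEUE_COUNT:
        poison.append(msg["body"])  # set aside instead of retrying forever
        queue.delete(msg["id"])
        return True
    try:
        handler(msg["body"])
    except Exception:
        return True  # not deleted: it becomes visible again after the lease
    queue.delete(msg["id"])  # delete only after successful processing
    return True


done, poison = [], []


def handler(body):
    if body == "bad":
        raise ValueError("cannot process")
    done.append(body)


queue = FakeQueue()
queue.put("good")
queue.put("bad")
now = 0.0
while process_one(queue, now, handler, poison):
    now += 60.0  # simulated clock: each pass happens after the lease expired
print(done, poison)
```

Because a message can be delivered more than once whenever a lease expires mid-processing, the handler has to be idempotent, as the slide notes.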

  29. Shared Access Signatures (SAS) • Use HTTPS • Securely use/transport SAS tokens • Use minimum permissions needed and restrict time period for access • Clock Skew - Clients should renew SAS token with sufficient leeway • Revocable SAS - Use policy to store SAS permissions, expiry etc. • Only 5 policies can be associated with container • When removing and recreating policies, change policy IDs
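The clock-skew advice amounts to renewing a token some time before its stated expiry. A minimal sketch, assuming the client knows the token's expiry time; `needs_renewal` and the five-minute leeway are illustrative choices, not values from the talk.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(expiry, now=None, leeway=timedelta(minutes=5)):
    """True once the current time is within `leeway` of the token's expiry."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expiry - leeway


expiry = datetime(2012, 6, 12, 12, 0, tzinfo=timezone.utc)
early = datetime(2012, 6, 12, 11, 50, tzinfo=timezone.utc)
late = datetime(2012, 6, 12, 11, 56, tzinfo=timezone.utc)
print(needs_renewal(expiry, now=early), needs_renewal(expiry, now=late))
```

The leeway should comfortably exceed the clock difference expected between client and service, plus the time it takes to fetch a new token.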

  30. Diagnose Storage Requests • Take control of your diagnostics • Wireshark and Fiddler are your friends • Turn Analytics On – Logging & Metrics • Analytics Data is stored using different namespace within your account • Learn from your application behavior • Monitor your storage account usage • End to End tracing • Portal allows configuring analytics • Turn on retention policy • Storage service will delete old logs and table data automatically!

  31. demo Analytics Demo

  32. Windows Azure Storage Future Direction

  33. Future • Geo • Read from Secondary • Control your own failover • Test how your application behaves • Compliance reasons • SLA for Geo replication • RPO and RTO • 1.8 Client library & Storage Emulator • Ship with SDK & Supports 2012-02-12 version • Better cloud parity with Development Storage Emulator with support for 2012-02-12 version

  34. Track Resources • Storage team blogs @ http://blogs.msdn.com/b/windowsazurestorage/ • Pricing information @ https://www.windowsazure.com/en-us/pricing/details/ • Getting Started @ https://www.windowsazure.com/en-us/develop/overview/ • Storage 1.7.1 @ https://github.com/WindowsAzure/azure-sdk-for-net/tree/sdk_1.7.1

  35. Track Resources • @WindowsAzure • @ms_teched • Hands-On Labs • Meetwindowsazure.com • Download Windows Azure: Windowsazure.com/teched

  36. Resources • Connect. Share. Discuss. http://northamerica.msteched.com • Learning – Microsoft Certification & Training Resources www.microsoft.com/learning • TechNet – Resources for IT Professionals http://microsoft.com/technet • Resources for Developers http://microsoft.com/msdn


  39. © 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
