1 / 47

April 25, 2016

OpenStack Swift 101 Technology & Architecture Albert Chen – Systems Engineer Eric Rife – Senior Systems Engineer. April 25, 2016. Who Are We ? Why A re W e H ere ?. New Breed of Object Storage Deflation in the storage market Huge growth of AWS S3 storage

rgarner
Download Presentation

April 25, 2016

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. OpenStack Swift 101Technology & Architecture Albert Chen – Systems Engineer Eric Rife – Senior Systems Engineer April 25, 2016

  2. Who Are We? • Why Are We Here? • New Breed of Object Storage • Deflation in the storage market • Huge growth of AWS S3 storage • Software-Defined & Commoditization

  3. Agenda What Is an Object? Why Use Swift? What’s Under The Hood? Use Case • Backup • Big Data / Map Reduce • Media & Entertainment • Life Science Hands on Lab

  4. Object eh? • What is an object??

  5. What is an Object File: 15x6x3_10x00x41741x64561_21x58x511_n.jpg Metadata: Image size: 1516x2048 Date taken: 2013-12-27 13:19 Tags: Asia, Japan, Bamboo forest, Arashiyama GPS: 35.016520, 135.670436 Object = File + Metadata

  6. Well… I have a lot of Objects

  7. Object eh? • What can I do with it?

  8. What is for Dinner? Metadata Search: Taiwanese Beefy Noodle Metadata info is stored with: Object Searchable Index This allows for meta data search

  9. What is for Dinner? Metadata Search: Taiwanese Beefy Noodle

  10. But Why Swift? • I like objects but what does Swift have to do with it?

  11. Why you need Object? Traditional - File-based Object Storage - HTTP Namespace Mobile Devices App Large File / Big Data Software-as-a-Service Apps App App

  12. Right Tool for the Right Job • Application-scale • Site-scale • Web-scale

  13. Examples of Object Storage: Object Storage Services: Amazon S3 Rackspace Cloud Files HP Cloud Object Storage IBM Softlayer Object Storage Object Storage Systems: OpenStack Object Storage System (“Swift”)

  14. Why Choose OpenStack Swift & Swift API Swift API accessible via HTTP Applications can consume storage from anywhere Larger range of functionality Puts the developer in control Swift Object Storage superior to Traditional Storage Designed to scale from TB -> PB -> EB Replication is automatic across Nodes, Zones, and Regions Supports both Replicas & Erasure Coding No Single Points of Failure -> Lose Nodes, Racks or Data Centers Ingress / Egress data from all Proxies - No Masters Balance Heterogeneous Commodity Hardware

  15. What’s Under the Hood? • Take a closer look at the characteristics and different components of Swift

  16. Access Method: RESTful http API Object Storage API Operations for Objects: http Load-Balancer http http Object Storage Node Object Storage Node There are also API operations for Accounts and Containers.

  17. Every Object Has a URL Base URL API Version http://example.com/v1/account/container/object • Namespaces used to group objects within an account • Containers are unlimited • Like folders, but can’t nest them • Each account has its own URL • Swift is multi-tenant • Each object is addressed as a URL • Users name the object • Objects are not organized based on hierarchy • Instead, object names may contain “/”, so pseudo-nested directories are possible.

  18. Swift High-Level Architecture Load Balancing/Authentication Proxy Account / Container / Object Replication and Consistency Standard Hardware

  19. Load Balancing | SSL | Authentication Swift Architecture Proxy • Load Balancing • Requests are load balanced across all nodes running proxy server processes • SSL • Optionally, SSL termination can be enabled to encrypt data in flight • Authentication • OpenStack Swift has a pluggable authentication system • Modules include API/UI-driven, LDAP, AD, Keystone Account | Container | Object Replication | Consistency Standard Servers with Disks

  20. Load Balancing | SSL | Authentication Swift Architecture • Proxy • Only part of the cluster that “talks” to external clients • Primarily with HTTP RESTful Swift API • Routes requests from clients to disk • Three replicas are simultaneous written • Quorum required • Users single replica for reads • Routes around failures • Enforces ACLs set by user Account | Container | Object Proxy Replication | Consistency Standard Servers with Disks

  21. Load Balancing | SSL | Authentication Swift Architecture • Account / Container • Accounts keep records of containers • Containers keep records of objects • Object • The object servers store the data on disk • Metadata is stored with the data • Uses standard filesystem (XFS) Proxy Replication | Consistency Account | Container | Object Standard Servers with Disks

  22. Load Balancing | SSL | Authentication Swift Architecture • Replication • Constantly checking for replicas status • Only updates other replica sites, does not pull in newer versions of objects • Consistency • Constantly ‘scrubbing’ data to check for bad data • Note: These processes run in the background on Nodes where account, container, or object server processes are running. Proxy Account | Container | Object Replication | Consistency Standard Servers with Disks

  23. Load Balancing | SSL | Authentication Swift Architecture Proxy • Standard Servers • Runs on standard server hardware • SATA or SAS disks • No RAID • Visibility to hardware beneath • Swift is already providing data redundancy • Reduce cost Account | Container | Object Replication | Consistency Standard Servers with Disks

  24. Hardware General Purpose • HP SL 4540 4U 60-bay (3 server nodes) • SuperMicro STX-CL XE36-2460 36-bay • Cisco UCS C3260 62-bay (2 server nodes) Backup & Archive (dense) • HP SL 4540 4U 60-bay (1 server nodes) • Cisco UCS C3160 62-bays (1 server node) • Dell DSS7000 4U 90-bay (2 server nodes) • SuperMicro CSE-946ED-R2kJBOD 90-Bay (enclosure) • Seagate 5U 84-bay (enclosure)

  25. Load Balancing | SSL | Authentication Swift Architecture Summary Proxy • Native HTTP API • Scales Linearly • No Single-point of Failure • Standard Servers and Linux • Extremely Durable • Resilient to hardware failures • Routes around network failures • Consistency Model Enables Multi-Data Center Account | Container | Object Replication | Consistency Standard Servers with Disks

  26. Multi-Region and Global Clusters • Storage in every corner of the world with a single pane of control

  27. Regions and Zones Regions Physically separate, often defined by geographical boundaries Minimum: 1 Region Zones Regions contain one or more zones Designate a group of nodes sharing a set of physical hardware Also referred to as failure domains Multi-Region Clusters Tolerate failures across regions and zones Policies control placement across regions Send requests to “closest” region Region 1 Region 2 Load Balancer Load Balancer Global DNS Proxy Node Proxy Node Proxy Node Proxy Node Replication Network Object Nodes Object Nodes Object Nodes Object Nodes Object Nodes Object Nodes Account and Container Nodes Account and Container Nodes Object Nodes Object Nodes Account and Container Nodes Account and Container Nodes Zone 1 Zone 1 Zone 2 Zone 2

  28. Data Placement: “As Unique as Possible” • Large Cluster • Single Node Cluster • Storage Racks are “as-unique-as-possible” • Disks are “as-unique-as-possible” • Multi-Region • Small Cluster • Storage Nodes are “as-unique-as-possible” • Distributed data centers are “as-unique-as-possible”

  29. Storage Policies Benefits Optimizes storage for applications and users Consolidates storage tiers under one system Simplifies management Lowers storage infrastructure TCO Policies Encompass: Storage media Number of replicas Erasure Codes / Replicas Common Policy Groupings: According to performance According to geography or political boundary According to data protection scheme, e.g., number of replicas or erasure coding • Region1 Policy • Region 2 Policy Zone 1 Zone 1 Zone 2 Zone 2

  30. MRC: Write Affinity Proxy Write Affinity • By default, objects are written to all the locations simultaneously. • Write Affinity: writes all copies locally then transfer asynchronously to other regions. Region 2 Region 3 Region 1

  31. MRC: Write Affinity Proxy Write Affinity • By default, objects are written to all the locations simultaneously. • Write Affinity: writes all copies locally then transfer asynchronously to other regions. Region 2 Region 3 Region 1

  32. MRC: Read Affinity Proxy Read Affinity • Prioritizes “Nearby” • Zone/Region, Where am I? • Latency to Storage Node • DNS Routes User • Each proxy pool has it’s own hostname • Routes user to closest region Region 1 Region 2 Global DNS

  33. Cool, What can I do with it? • What are some situation I really should be using Swift?

  34. Use Case – Backup Backup • Enterprise: Workstations • MSP: Cloud backup service Behavior • Write optimized • High throughput • Low concurrency Configuration • PACO nodes (Replicas or Erasure Code) • High density • Seagate 5U 84-bay enclosure • Dell DSS7000 4U 90-bay server • SuperMicro SC946ED 90-bay server Partners • Netbackup • CommVault

  35. Use Case – Big Data / MapReduce Hadoop / Spark Philosophy: Let HDFS do what it’s best at: Serving data where you want it when you need it. SwiftStack for warm and cold data storage. Why? • SwiftStackdurability and reliability guarantees • SwiftStack managed capacity: Grow into the petabytes and beyond • Easier integration with various data input sources • Share results using a common storage platform How? • Run MapReduce job • Read input data from SwiftStack • Write transient results to HDFS • Write result to SwiftStack • Hadoop-OpenStack (formerly SwiftFS): https://hadoop.apache.org/docs/current/hadoop-openstack/

  36. Use Case – Media & Entertainment Media encoding / ingest • Streaming data translation • Middleware: Storlets (IBM) • https://github.com/openstack/storlets Content Streaming • Default behavior in SwiftStack Large objects • Divide media files into 15 or 30 second time periods • Static Large Object • Add custom content / commercials

  37. Use Case – Life Science Content repository High throughput • Many laboratory instruments writing at once • Gene Sequencers, mass spectrometers, etc. • Up to 1GB/min per instrument High capacity (starting at 1 PB+) Durability • Auditor ensures data remains consistent / not corrupted Availability • CDN-like data distribution Auditing • Object Versioning and tagging

  38. Workshop:OpenStack Swift • Let’s try this out

  39. LAB: Installing SwiftStack • https://files.swiftstack.com/workshoplabmanual.pdf • https://files.swiftstack.com/workshopfullmanual.pdf

  40. Using Swift • I got it installed, let’s use it

  41. SwiftStack Web Console In order to do simple operations, and list containers and objects in a graphical representation, SwiftStack provides the Web Console.

  42. SwiftStack Web Console In the current version of the Web Console you can perform the following actions: • Connect to other accounts • Create containers • Upload objects • Download objects • Copy objects • Move objects • Delete objects

  43. 3rd Party Applications

  44. Other Interesting Sessions Ancestry.comin Production with OpenStack and KubernetesWednesday, April 27, 11:00am-11:40am Jordan Nielsen https://www.openstack.org/summit/austin-2016/summit-schedule/events/8826 Doubling performance in Swift with no code changesWednesday, April 27, 2:40pm-3:20pm David Stewart • John Dickinson https://www.openstack.org/summit/austin-2016/summit-schedule/events/8312 Monitoring Swift ++ (inclNagios, Elasticsearch, Zabbix, & more)Thursday, April 28, 9:50am-10:30am Martin Lanner • Adam Takvam https://www.openstack.org/summit/austin-2016/summit-schedule/events/8818 Swift 102: Beyond CRUD - More real demosThursday, April 28, 11:50am-12:30pm Doug Soltesz • Clay Gerrard https://www.openstack.org/summit/austin-2016/summit-schedule/events/8792

  45. Other Interesting Sessions How Burton Snowboards is carving down the OpenStack trailThursday, April 28, 1:30pm-2:10pm Mario Blandini • Jim Merritt https://www.openstack.org/summit/austin-2016/summit-schedule/events/8841 FILE this: Swift API, then S3 API, and now POSIX access to OpenStack SwiftThursday, April 28, 2:20pm-3:00pm Joe Arnold • John Dickinson https://www.openstack.org/summit/austin-2016/summit-schedule/events/8704

  46. I want to know more… Detailed overview and guides: http://swiftstack.com/openstack-swift/ Swift All-In-One Step-by-step Deployment Guide http://docs.openstack.org/developer/swift/development_saio.html API docs:http://api.openstack.org/api-ref-objectstorage-v1.html For contributors:#openstack-swift on freenode IRC SwiftStack Free Trial https://swiftstack.com/try-it-now/ OpenStack Swift: Using, Administering, and Developing for Swift Object Storage (ISBN-13: 978-1491900826 ) By Joe Arnold Book Signing April 26 • 10:45am - 11:15am – A25 April 27 • 01:15pm - 01:45pm – A25 April 28 • 12:45pm - 01:15pm – A25

  47. Thank You! Albert Chen achen@swiftstack.com Eric Rife erife@swiftstack.com

More Related