
Vorlesung Speichernetzwerke



Presentation Transcript


  1. Vorlesung Speichernetzwerke Dipl. – Ing. (BA) Ingo Fuchs 2003

  2. Agenda I/II • Storage Infrastructure • Client / Server, LAN • Storage DAS, SAN • Storage NAS, iSAN • Backup / Restore • Disaster Recovery • Virtualization • Storage Management

  3. Agenda II/II • Business Cases • Examples for SAN, NAS, iSAN • TCO / ROI • Analysts View of the storage market • Storage Billing, SLAs • Storage as a utility • Project Model for Storage

  4. Agenda Storage Infrastructure I/II • Client / Server Infrastructures • LAN Basics • Server Components • SCSI • RAID • SAN Basics, FC • FC Infrastructure Components (Hubs, Switches, Router) • SAN Zoning, LUN Masking, Replication • NAS Basics, IP • NAS Protocols (CIFS, NFS) • iSAN (iSCSI)

  5. Agenda Storage Infrastructure II/II • Backup / Restore Basics • Backup Hardware • Backup Software • Backup Methods (Full, Differential, Incremental) • Archiving • Disaster Recovery • Storage Virtualization • In-Band / Out-of-Band Virtualization Methods • Storage Infrastructure Management

  6. Agenda Business Cases • Example Scenarios for SAN, NAS, iSAN • TCO / ROI Basics • TCO / ROI Scenarios • Analysts Views on the storage marketplace • Gartner Magic Quadrant • Service Level Agreements • Storage Resource Billing • A new view on Storage – Storage as a Utility

  7. Agenda Project Model for Storage • 7 steps model • Etc.

  8. Storage Infrastructure

  9. Client / Server • Client / Server Infrastructure Basics

  10. LAN Basics • Switches, Hubs, Routers • LAN Infrastructure • TCP/IP Basics, OSI Layers • NICs, TCP/IP Offload NICs • SMB/CIFS, NFS

  11. Server Components • Von-Neumann Architecture • Bus Architecture (PCI etc.) • Internal / external Disks

  12. SCSI • NFS vs. SCSI

  13. RAID Basics • In 1987, Patterson, Gibson and Katz at the University of California, Berkeley published a paper entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)". This paper described various types of disk arrays, referred to by the acronym RAID. The basic idea of RAID was to combine multiple small, inexpensive disk drives into an array of disk drives which yields performance exceeding that of a Single Large Expensive Drive (SLED). Additionally, this array of drives appears to the computer as a single logical storage unit or drive. • Since then, control of RAID standards has moved to the RAID Advisory Board (RAB) and RAID has been renamed to "Redundant Array of Independent Disks". • Array = multiple physical disks treated as one logical disk. • Striping = spreading data over multiple disk drives.

  14. RAID - Striping • Fundamental to RAID is "striping", a method of concatenating multiple drives into one logical storage unit. Striping involves partitioning each drive's storage space into stripes, which may be as small as one sector (512 bytes) or as large as several megabytes. These stripes are then interleaved round-robin, so that the combined space is composed alternately of stripes from each drive. In effect, the storage space of the drives is shuffled like a deck of cards. The type of application environment, I/O-intensive or data-intensive, determines whether large or small stripes should be used. • Most multi-user operating systems today, like NT, Unix and Netware, support overlapped disk I/O operations across multiple drives. However, in order to maximize throughput for the disk subsystem, the I/O load must be balanced across all the drives so that each drive can be kept busy as much as possible. In a multiple-drive system without striping, the disk I/O load is never perfectly balanced: some drives will contain data files which are frequently accessed and some drives will only rarely be accessed. In I/O-intensive environments, performance is optimized by striping the drives in the array with stripes large enough so that each record potentially falls entirely within one stripe. This ensures that the data and I/O will be evenly distributed across the array, allowing each drive to work on a different I/O operation, and thus maximizes the number of simultaneous I/O operations which can be performed by the array. • In data-intensive environments and single-user systems which access large records, small stripes (typically one 512-byte sector in length) can be used so that each record spans all the drives in the array, each drive storing part of the data from the record. This causes long record accesses to be performed faster, since the data transfer occurs in parallel on multiple drives. Unfortunately, small stripes rule out multiple overlapped I/O operations, since each I/O will typically involve all drives. However, operating systems like DOS, which do not allow overlapped disk I/O, are not negatively impacted. Applications such as on-demand video/audio, medical imaging and data acquisition, which use long record accesses, will achieve optimum performance with small-stripe arrays. • A potential drawback to using small stripes is that synchronized-spindle drives are required in order to keep performance from being degraded when short records are accessed. Without synchronized spindles, each drive in the array will be at a different random rotational position. Since an I/O cannot be completed until every drive has accessed its part of the record, the drive which takes the longest will determine when the I/O completes. The more drives in the array, the more the average access time for the array approaches the worst-case single-drive access time. Synchronized spindles ensure that every drive in the array reaches its data at the same time. The access time of the array will thus be equal to the average access time of a single drive rather than approaching the worst-case access time.
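
To make the round-robin interleaving above concrete, here is a minimal Python sketch (not from the original slides) that maps a logical block address to a (drive, stripe, offset) location; the stripe size and drive count are illustrative assumptions.

```python
# Minimal sketch: map a logical block address (LBA) to a physical
# (drive, stripe, offset) location in a round-robin striped array.
# stripe_size is given in 512-byte sectors; all values are illustrative.

def map_lba(lba: int, num_drives: int, stripe_size: int):
    """Return (drive index, stripe number on that drive, sector offset)."""
    stripe_index = lba // stripe_size          # which stripe of the whole array
    offset = lba % stripe_size                 # sector within that stripe
    drive = stripe_index % num_drives          # round-robin across drives
    stripe_on_drive = stripe_index // num_drives
    return drive, stripe_on_drive, offset

# Example: 4 drives, 64-sector (32 KB) stripes.
for lba in (0, 63, 64, 200):
    print(lba, map_lba(lba, num_drives=4, stripe_size=64))
```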

  15. RAID Levels • RAID 0 is the fastest and most efficient array type but offers no data redundancy or fault tolerance. • RAID 1 is the array technique of choice for performance-critical, fault-tolerant environments and is the main choice for fault tolerance if no more than two drives are available. • RAID 3 is a popular choice for data-intensive or single-user applications that access long sequential records. However, it does not typically allow multiple I/O operations to be overlapped. • RAID 4 offers no practical advantages over RAID 5 and does not typically support multiple simultaneous write operations – unless used in NetApp environments. • RAID 5 is generally the best choice for multi-user environments that are not particularly sensitive to write performance. At least three, and typically five or more, drives are required to build a RAID 5 array.
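
As a small illustration of the capacity figures quoted on the following slides (RAID 0 = N, RAID 1 = N/2, RAID 3/4/5 = N−1), here is a hypothetical helper that is not part of the lecture material.

```python
# Sketch: usable capacity (in units of one disk) for the RAID levels
# discussed in this lecture, following the capacity figures on the
# next slides: RAID 0 = N, RAID 1 = N/2, RAID 3/4/5 = N - 1.

def usable_capacity(level: int, n_disks: int) -> float:
    if level == 0:
        return n_disks
    if level == 1:
        if n_disks % 2:
            raise ValueError("RAID 1 mirrors drives in pairs")
        return n_disks / 2
    if level in (3, 4, 5):
        if n_disks < 3:
            raise ValueError("RAID 3/4/5 need at least three drives")
        return n_disks - 1
    raise ValueError("level not covered here")

for level in (0, 1, 5):
    print(f"RAID {level}, 6 disks -> {usable_capacity(level, 6)} disks usable")
```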

  16. RAID 0 – Data Striping • Capacity: N (N = number of disks) • Performance: Best – significant performance advantage over a single disk (because the array can serve N requests simultaneously across N disks rather than queuing N requests to one disk). • Protection: Poor – if one disk fails, all data is lost and all disks must be reformatted (the array could, however, be restored from tape). • Description: Data is striped (spread) across each disk in the array in sector-sized units for improved performance. • RAID-0 is typically defined as a non-redundant group of striped disk drives without parity. RAID-0 arrays are usually configured with large stripes for I/O-intensive applications, but may be sector-striped with synchronized-spindle drives for single-user and data-intensive environments which access long sequential records. Since RAID-0 does not provide redundancy, if one drive in the array crashes, the entire array crashes. However, RAID-0 arrays deliver the best performance and data storage efficiency of any array type.

  17. RAID 1 – Data Mirroring or Duplexing • Performance: Good – since there are at least two disks, a read request can be served by either disk. Duplexing attaches each disk to a separate controller, so performance may be further improved. • Protection: Good – either disk can fail and the data is still accessible from the other disk. With duplexing, a disk controller can fail as well and the data remains fully protected. • Capacity: N/2 • Description: Disk mirroring duplicates data (complete files) from one disk onto a second disk using a single disk controller. Disk duplexing is the same as mirroring except that the disks are attached to a second disk controller (like two SCSI adapters). • RAID-1, better known as "disk mirroring", is simply a pair of disk drives which store duplicate data but appear to the computer as a single drive. Striping is not used, although multiple RAID-1 arrays may be striped together to appear as a single larger array consisting of pairs of mirrored drives, typically referred to as a "dual-level array" or RAID 10. Writes must go to both drives in a mirrored pair so that the information on the drives is kept identical. Each individual drive, however, can perform simultaneous read operations. Mirroring thus doubles the read performance of an individual drive and leaves the write performance unchanged. • RAID 1E array (with an odd number of disks): the first stripe is the data stripe and the second stripe is the mirror (copy) of the first data stripe, but shifted by one drive.
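
A toy model, added here purely for illustration, of the mirroring behaviour just described: every write goes to both drives, and reads alternate between them, which is why read throughput roughly doubles while write throughput does not.

```python
# Toy model of the RAID 1 behaviour described above: writes go to both
# drives so they stay identical, reads alternate between the two copies.
# Illustrative only - not a real driver.

class MirroredPair:
    def __init__(self, n_blocks: int):
        self.drives = [bytearray(n_blocks * 512), bytearray(n_blocks * 512)]
        self._next_read = 0

    def write(self, block: int, data: bytes):
        assert len(data) == 512
        for drive in self.drives:                 # both copies updated
            drive[block * 512:(block + 1) * 512] = data

    def read(self, block: int) -> bytes:
        drive = self.drives[self._next_read]      # either copy is valid
        self._next_read ^= 1                      # alternate for load balancing
        return bytes(drive[block * 512:(block + 1) * 512])

pair = MirroredPair(n_blocks=8)
pair.write(3, b"\xab" * 512)
assert pair.read(3) == pair.read(3) == b"\xab" * 512
```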

  18. RAID 3 – Data Striping (Bytes) with Parity Disk • Performance: Good for large transfers only - RAID 3 is generally considered better for transfer of large data blocks such as graphics or imaging files. • Protection: Good - if any disk fails, the data can still be accessed by using the information from the other disks and the parity disk to reconstruct it. • Capacity: N-1 • Description: RAID 3 stripes data, one byte at a time, across all the data drives. Parity information, used to reconstruct missing data, is stored on a dedicated drive. RAID 3 requires at least two data disks, but works best with four disks (and one parity disk). • RAID-3, as with RAID-2, sector-stripes data across groups of drives, but one drive in the group is dedicated to storing parity information. RAID-3 relies on the embedded ECC in each sector for error detection. In the case of a hard drive failure, data recovery is accomplished by calculating the exclusive OR (XOR) of the information recorded on the remaining drives. Records typically span all drives, thereby optimizing data intensive environments. Since each I/O accesses all drives in the array, RAID-3 arrays cannot overlap I/O and thus deliver best performance in single-user, single-tasking environments with long records. Synchronized-spindle drives are required for optimum RAID-3 arrays in order to avoid performance degradation with short records.
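
The XOR recovery mentioned above can be shown in a few lines. This is an illustrative sketch, not part of the original slides: the parity stripe is the XOR of all data stripes, so any single lost stripe can be rebuilt from the parity plus the surviving stripes.

```python
# Minimal sketch of XOR parity recovery: the parity stripe is the XOR of
# all data stripes, so a single lost stripe is the XOR of the parity with
# the surviving stripes. Toy block values, dedicated parity drive (RAID 3/4).
from functools import reduce

def xor_blocks(blocks):
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

data = [b"\x01\x02", b"\x0f\x0f", b"\xaa\x55"]   # three data drives
parity = xor_blocks(data)                        # dedicated parity drive

# Drive 1 fails: rebuild its block from parity + remaining drives.
rebuilt = xor_blocks([parity, data[0], data[2]])
assert rebuilt == data[1]
```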

  19. RAID 4 – Data Striping (Sectors) with Parity Disk • RAID-4 is identical to RAID-3 except that large stripes are used, so that records can be read from any individual drive in the array (except the parity drive), allowing read operations to be overlapped. However, since all write operations must update the parity drive, they cannot be overlapped. This architecture offers no significant advantages over RAID-5, except scalability of disk drives in dynamic array environments (Network Appliance implementation – WAFL Filesystem).

  20. RAID 5 – Data and Parity Striping • Capacity: N-1 • Description: RAID 5 stripes data, one or more sectors at a time, across all disks. Parity is interleaved with the data rather than stored on a dedicated drive. RAID 5 works with a minimum of three disks. • Performance: Good for networks – RAID 5 is preferred for smaller block transfers, around the size of typical network files. RAID 5 can degrade server throughput by about 35% compared to RAID 0. • Protection: Best – if any disk fails, the data can still be accessed by using the information from the other disks along with the striped parity information. • RAID-5, sometimes called a rotating parity array, avoids the write bottleneck caused by the single dedicated parity drive of RAID-4. Like RAID-4, large stripes are used so that multiple I/O operations can be overlapped. However, unlike RAID-4, each drive takes turns storing parity information for a different series of stripes. Since there is no dedicated parity drive, all drives contain data and read operations can be overlapped on every drive in the array. Write operations will typically access a single data drive, plus the parity drive for that record. Since, unlike RAID-4, different records store their parity on different drives, write operations can be overlapped. • RAID-5 offers improved storage efficiency over RAID-1 since parity information is stored rather than a complete redundant copy of all data. As a result, any number of drives can be combined into a RAID-5 array, with the effective storage capacity of only one drive sacrificed to store the parity information. Therefore, RAID-5 arrays provide greater storage efficiency than RAID-1 arrays. However, this comes at the cost of a corresponding loss in performance. • When data is written to a RAID-5 array, the parity information must be updated. This is accomplished by determining which data bits were changed by the write operation and then changing the corresponding parity bits.
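
The parity update described in the last bullet is the classic read-modify-write: new parity = old parity XOR old data XOR new data. A minimal sketch, added here for illustration only:

```python
# Sketch of the RAID 5 small-write (read-modify-write) parity update:
# new_parity = old_parity XOR old_data XOR new_data, so only the data
# drive and the parity drive for that stripe are touched.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def small_write(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Return the new parity after overwriting one data stripe."""
    return xor(xor(old_parity, old_data), new_data)

# Toy stripe with three data drives plus parity:
d = [b"\x11", b"\x22", b"\x33"]
p = xor(xor(d[0], d[1]), d[2])

new_d1 = b"\x99"
p = small_write(d[1], new_d1, p)
d[1] = new_d1

# Parity still equals the XOR of all data stripes.
assert p == xor(xor(d[0], d[1]), d[2])
```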

  21. Hardware vs. Software RAID • To make informed disk array purchase decisions, it is important to look beyond a discussion of the different RAID levels and also understand the differences between software- and hardware-based RAID implementations. Software-based arrays are typically the least expensive to implement. However, the true cost of this implementation can increase dramatically if the demand they place on the server CPU necessitates an upgrade to maintain acceptable network performance. • Software-based RAID implementations are either operating-system-based or application programs that run on the server. Most operating system RAID implementations provide support for RAID-1 and, to a lesser degree, some operating systems also provide support for RAID-5. All array operations and management functions are controlled by the array software running on the host CPU. • Most hardware-based arrays are implemented directly on a host-based RAID adapter, which tightly couples the array functions with the disk interface. Additionally, this design allows all of the array operations and management to be off-loaded from the host CPU and instead be executed locally on an embedded processor. Different hardware RAID adapters offer some combination of RAID configuration options, including RAID-0, RAID-1, RAID-3 and RAID-5.

  22. Hardware vs. Software RAID - Performance • Just like any other application, software-based arrays occupy host system memory, consume CPU cycles and are operating system dependent. By contending with other applications that are running concurrently for host CPU cycles and memory, software-based arrays degrade overall server performance. Also, unlike hardware-based arrays, the performance of a software-based array is directly dependent on server CPU performance and load. • Except for the array functionality, hardware-based RAID schemes have very little in common with software-based implementations. Since the host CPU can execute user applications while the array adapter's processor simultaneously executes the array functions, the result is true hardware multi-tasking. Hardware arrays also do not occupy any host system memory, nor are they operating system dependent. • Hardware arrays are also highly fault tolerant. Since the array logic is based in hardware, software is NOT required to boot. Some software arrays, however, will fail to boot if the boot drive in the array fails. For example, an array implemented in software can only be functional when the array software has been read from the disks and is memory-resident. What happens if the server can't load the array software because the disk that contains the fault tolerant software has failed? Software-based implementations commonly require a separate boot drive, which is NOT included in the array.

  23. SAN Basics • World Wide Name (WWN) • Logical Unit Number (LUN) • Host Bus Adapter (HBA) • Hub, Switch, Router

  24. Differences – SCSI vs. FC Cables • SCSI cable: twisted pairs beneath a shield and PVC jacket; the outer layer carries data and parity; pairs are twisted in different directions in each layer to reduce capacitive interference. • Fibre Channel cable: one transmit line, one receive line; fibre or copper.

  25. FC Topologies • Point-to-Point (FC-PH): exactly two N_Ports connected together; no fabric elements present; no fabric services available. • Switch topology (FC-FG, FC-SW): may use the entire 24-bit address space; fabric elements may or may not be connectable/cascadable. • Arbitrated Loop topology (FC-AL): private loop with up to 126 NL_Ports; may attach to a cross-point switch topology through an FL_Port.

  26. Point-to-Point Topology (diagram: two nodes connected directly, N_Port to N_Port)

  27. Fabric Topology (diagram: nodes attach through their N_Ports to F_Ports on fabric switches; switches interconnect through E_Ports) • Up to 16 million N_Ports in one fabric

  28. Arbitrated Loop (diagram: nodes attach through NL_Ports to a shared loop; an FL_Port on a fabric switch connects the loop to an F_Port of the fabric) • Up to 126 NL_Ports on one loop

  29. Fibre Channel Structure • FC-4: Protocol Mapping Layer (Audio/Video, Channels, Networks, Virtual Interface) • FC-3: Common Services • FC-2: Framing Protocol / Flow Control • FC-1: Encode / Decode • FC-0: Physical Layer • FC-0 through FC-2 together form FC-PH

  30. FC-2 Frame Format • A frame is delimited by Idles on the link and consists of: SOF (1 word / 4 bytes), Header (6 words / 24 bytes), Data Payload (0–528 words / 0–2112 bytes), CRC (1 word / 4 bytes), EOF (1 word / 4 bytes). Bytes are counted as transmission characters.
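
A tiny sketch (illustrative, not from the slides) that derives the overall frame length from the field sizes above; the smallest frame is 36 bytes, the largest 2148 bytes.

```python
# Sketch: total FC-2 frame length from the field sizes in the slide above
# (SOF 4 B, header 24 B, payload 0-2112 B, CRC 4 B, EOF 4 B).

SOF, HEADER, CRC, EOF = 4, 24, 4, 4
MAX_PAYLOAD = 2112

def frame_length(payload_bytes: int) -> int:
    if not 0 <= payload_bytes <= MAX_PAYLOAD:
        raise ValueError("payload must be 0..2112 bytes")
    return SOF + HEADER + payload_bytes + CRC + EOF

print(frame_length(0))      # 36   (smallest frame)
print(frame_length(2112))   # 2148 (largest frame)
```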

  31. FC Addressing • The 24-bit address is divided into Domain, Area and Device fields, giving 16 million addresses in a fabric. • An arbitrated loop (FC_AL) attached through an FL_Port provides 127 addresses.
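
As an illustration of the 8/8/8-bit split, a short sketch (not from the slides) that packs and unpacks a 24-bit FC address:

```python
# Sketch of the 24-bit FC address split shown above: 8 bits each for
# Domain, Area and Device (2^24 ~ 16 million fabric addresses), while an
# arbitrated loop uses only the low byte (127 addresses via the FL_Port).

def pack_fcid(domain: int, area: int, device: int) -> int:
    return (domain << 16) | (area << 8) | device

def unpack_fcid(fcid: int):
    return (fcid >> 16) & 0xFF, (fcid >> 8) & 0xFF, fcid & 0xFF

fcid = pack_fcid(0x10, 0x1F, 0x04)
print(hex(fcid))            # 0x101f04
print(unpack_fcid(fcid))    # (16, 31, 4)
```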

  32. FC Zoning (diagram: zoning example in which no zone sees Loop 2)

  33. Switches & Loops – High Availability (diagram: four workstations attached to a switched Fibre Channel fabric built from two switches; four storage controllers attached via a Fibre Channel loop)

  34. Switch Fabric – Example (diagram: twelve nodes and two loops attached to three cascaded FC switches via F_Ports, FL_Ports and E_Ports, divided into zones Z1–Z6) • Z1: 6 ports on 3 switch elements • Z2: 4 ports on 1 switch element • Z3: 4 ports on 2 switch elements • Z4: 4 N_Ports, 2 FL_Ports, 10 L_Ports on 1 switch element • Z5: 5 L_Ports, 1 FL_Port, 1 hub • Z6: 5 L_Ports, 1 FL_Port, 1 hub • Note 1: Any switch could have an FL_Port with an attached loop. • Note 2: Zones may also be restricted by Class of Service.

  35. FC Connectors – Copper • DB-9 connector: pin 1 = Transmit +, pin 6 = Transmit −, pin 5 = Receive +, pin 9 = Receive −, shell = cable shield • HSSDC connector • DB9/HSSDC converter

  36. FC Connectors - Fiber

  37. FC Infrastructure Components • More and more, the design and deployment of SAN technology involves incorporating specialized interconnection equipment. This category of devices includes Fibre Channel hubs, switches and bridges. This hardware is generally responsible for linking together the data storage peripherals, such as RAID systems, tape backup units and servers, within a SAN. • These interconnection devices are somewhat analogous to their LAN-related counterparts. They perform functions such as data frame routing, media and interface conversion (e.g. copper to optical, Fibre Channel to SCSI), network expansion, bandwidth enhancement and zoning, and they allow concurrent data traffic. Just as customers today are more involved in the design and implementation of their LANs and WANs, they are also looking at these building blocks of SANs to create their own SAN solutions. • Fibre Channel HBAs, hubs, switches and FC/SCSI bridges are some of the building-block components with which IT administrators can address SAN-based backup, server clustering, enhanced bandwidth, extended distance and other application-driven challenges. Selecting the appropriate pieces requires an understanding of what each component can do. When, for example, is a fabric switch a better solution than a hub? When should hubs and switches be used in combination? There are no universal answers to these questions, but understanding the architecture and capabilities of switches, hubs and bridges provides a basis for making appropriate choices for SAN designs.

  38. FC Hubs • Similar in function to Ethernet or Token Ring hubs, an Arbitrated Loop hub is a wiring concentrator. Hubs were engineered in response to problems that arose when Arbitrated Loops were built by simply connecting the transmit lines to the receive lines between multiple devices. A hand-built daisy chain of transmit/receive links between three or more devices allows a circular data path or loop to be created, but poses significant problems for troubleshooting and for adding or removing devices. In order to add a new device, for example, the entire loop must be brought down while new links are added. If a fiber-optic cable breaks or a transceiver fails, all cables and connectors between all devices must be examined to identify the offending link. • Hubs resolve these problems by collapsing the loop topology into a star configuration. Since each device is connected to a central hub, the hub becomes the focal point of adds, moves or changes to the network. Arbitrated Loop hubs provide port-bypass circuitry that automatically reconfigures the loop if a device is removed, added or malfunctions. Before a new device is allowed to be inserted into a loop, the hub will, at a minimum, verify and validate its signal quality. Devices with poor signal quality, or an inappropriate clock speed, will be left in bypass mode, allowing other devices on the loop to continue operating without disruption. • Hubs typically provide LEDs for each port that give "at a glance" status of insertion, bypass or bad-link state. These features enable a much more dynamic environment in which problems can be more readily identified, particularly since devices can be hot-plugged or removed with no physical-layer disruption. • A hub port can be designed to accept either electrical or optical I/O. This capability is very useful when designing or configuring a network. For instance, if it were desirable to locate the hub some distance from the server, an optical connection (long wave or short wave) could be used between the server and hub, while copper connections could be used between the hub and local controllers. Hubs can be cascaded to provide additional ports for even more connectivity.

  39. FC Switches • Fibre Channel fabric switches are considerably more complex than loop hubs in both design and functionality. While a hub is simply a wiring concentrator for a shared 100MB/sec segment, a switch provides a high-speed routing engine and 100MB/sec data rates for each and every port. Apart from custom management functions, hubs do not typically participate in Fibre Channel activity at the protocol layer. A fabric switch, by contrast, is a very active participant in Fibre Channel conversations, both for services it provides (fabric log-in, Simple Name Server, etc.) and for overseeing the flow of frames between initiators and targets (buffer-to-buffer credit, fabric loop support, etc.) at each port. Providing fabric services, 100MB/sec. per port performance and the advanced logic required for routing initially kept the per port cost of first generation fabric switches quite high. Second generation, ASIC- (Application Specific Integrated Circuit) based, fabric switches have effectively cut the per port cost by more than half. This brings Fibre Channel fabric switches within reach of medium to large enterprise networks. • Fibre Channel Arbitrated Loops (FC-AL) are serial interfaces that create logical point-to-point connections between ports with the minimum number of transceivers and without a centralized switching function. FC-AL therefore provides a lower cost solution. However, the total bandwidth of a Fibre Channel arbitrated loop is shared by all of the ports on the loop. • Additionally, only a single pair of ports on the loop can communicate at one time, while the other ports on the loop act as repeaters.
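
A back-of-the-envelope comparison, assuming only the 100 MB/sec figure quoted above: a loop hub shares one segment among all ports, while a fabric switch delivers the full rate on every port. Illustrative only.

```python
# Rough comparison based on the text above: an arbitrated-loop hub shares
# one 100 MB/s segment among all ports, while a fabric switch provides
# 100 MB/s on each port.

LINK_MBPS = 100  # MB/s per FC link in this generation

def hub_per_port(ports: int) -> float:
    return LINK_MBPS / ports          # shared bandwidth

def switch_aggregate(ports: int) -> int:
    return LINK_MBPS * ports          # each port runs at full rate

print(hub_per_port(8))       # 12.5 MB/s per port if all 8 talk at once
print(switch_aggregate(8))   # 800 MB/s aggregate across the switch
```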

  40. FC Bridges / Routers • Fibre Channel to SCSI bridges provide conversion between these two different electrical interfaces and therefore allow IT managers to leverage investments in existing SCSI storage devices, while taking full advantage of the inherent benefits of Fibre Channel technology. • These devices are commonly used to connect Fibre Channel networks to legacy SCSI peripherals, such as tape backup systems.

  41. LUN Masking • Mapping from Servers to Storage Devices (WWN)
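
A minimal sketch of LUN masking as a lookup from initiator WWN to permitted LUNs; the WWNs and LUN numbers below are made up for illustration and do not come from the slides.

```python
# Minimal sketch of LUN masking: the storage array keeps a table mapping
# each server HBA (identified by its WWN) to the LUNs it may see.
# WWNs and LUN numbers are invented for illustration.

masking_table = {
    "10:00:00:00:c9:2b:aa:01": {0, 1, 2},    # e.g. a database server
    "10:00:00:00:c9:2b:aa:02": {3},          # e.g. a backup server
}

def lun_visible(initiator_wwn: str, lun: int) -> bool:
    return lun in masking_table.get(initiator_wwn, set())

print(lun_visible("10:00:00:00:c9:2b:aa:01", 1))   # True
print(lun_visible("10:00:00:00:c9:2b:aa:02", 1))   # False
```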

  42. SAN Data Replication • Data replication provides many benefits in today's IT environments. For example, it can enable system administrators to create and manage multiple copies of business-critical information across a global enterprise. This can maximize business continuity, enable disaster recovery solutions and Internet distribution of file server content, and improve host processing efficiency by moving data sets onto secondary servers for backup operations. These applications of data replication are frequently enhanced by the inherent "high data availability" features provided by today's SAN architectures. • Copying data from one server to another, or from one data storage system to one or more others, may be achieved in a number of ways. Traditionally, organizations have used tape-based technologies to distribute information. However, for many organizations that have built their businesses on an information infrastructure, the demand for instant access to information is increasing. While tape-based disaster recovery and content distribution solutions are robust, they do not support an instant-information-access model. Many organizations are supplementing, or replacing, their existing disaster recovery and content distribution solutions with online replication. • The replication of information is typically achieved with one of two basic strategies: • Storage-level replication is focused on the bulk data transfer of the files or blocks under an application from one server to one or more other servers. Storage replication is independent of the applications it is replicating, meaning that multiple applications can be running on a server while they are being replicated to a secondary server. • Application-level replication is specific to an application such as a database or web server, and is typically done at transaction level (whether a table, row or field) by the application itself. If multiple applications are being used on the same server, an application-specific replication solution must be used for each individually.

  43. Replication Types • Remote storage replication can be implemented at either the data storage array or the host level. Array-based (or hardware) storage replication is typically homogeneous, meaning that data is copied from one disk array to another of the same make and model. A dedicated channel such as ESCON (Enterprise Systems Connection) is commonly required to link the two arrays, and a number of other pieces of proprietary hardware, such as storage bridges, may also be required. Host-level storage replication is implemented in software at the CPU level and is independent of the disk array used. This replication is done using standard protocols such as TCP/IP across an existing network infrastructure such as ATM. • Whether implemented at the array or host level, storage replication works in one of two modes – synchronous or asynchronous. The choice of replication type has a number of implications: • Synchronous replication – In a synchronous replication environment, data must be written to the target before the write operation completes on the host system. This assures the highest possible level of data currency for the target system: at any point in time, it will have exactly the same data as the source. However, synchronous replication can introduce performance delays on the source system, particularly if the network connection between the systems is slow. Some solutions combine synchronous and asynchronous operation, switching to asynchronous replication dynamically when there are problems and reverting to synchronous once the communication problems are resolved. • Asynchronous replication – Using asynchronous replication, the source system does not wait for a confirmation from the target systems before proceeding. Products may queue (or cache) data and send batches of changes between the systems during periods of network availability.
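
To make the difference concrete, a toy sketch (illustrative only, not modelled on any real replication product): synchronous writes complete only after the target has applied the change, while asynchronous writes queue the change and ship it later.

```python
# Toy sketch contrasting the two replication modes described above:
# synchronous waits for the target to apply the change before the write
# completes, asynchronous queues the change and returns immediately.
from collections import deque

class Target:
    def __init__(self):
        self.blocks = {}
    def apply(self, block, data):
        self.blocks[block] = data            # "remote" copy updated

class Source:
    def __init__(self, target, synchronous=True):
        self.blocks, self.target = {}, target
        self.synchronous = synchronous
        self.queue = deque()                 # pending changes (async mode)

    def write(self, block, data):
        self.blocks[block] = data
        if self.synchronous:
            self.target.apply(block, data)   # completes only after remote apply
        else:
            self.queue.append((block, data)) # ship later, in batches

    def drain(self):
        while self.queue:
            self.target.apply(*self.queue.popleft())

tgt = Target()
src = Source(tgt, synchronous=False)
src.write(7, b"payload")
print(7 in tgt.blocks)   # False - target lags behind the source
src.drain()
print(7 in tgt.blocks)   # True  - caught up during "network availability"
```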

  44. NAS Basics

  45. NAS Protocols • Common Internet File System (CIFS) • Formerly known as SMB (Server Message Block) • Used for Windows environments • Native vs. "copied" implementation (e.g. Samba) • Network File System (NFS) • Used for Linux/UNIX environments • Internet SCSI (iSCSI) • Newly evolved protocol for block-level access over IP networks

  46. NAS Protocols

  47. iSCSI Structure

  48. iSCSI Protocol
