Data Storage and RAID Today

Brandon Krakowsky

Jeffrey Doto

Presentation Topics:
  • Who Relies on Data Storage?
  • Why is data storage so important?
  • Sarbanes-Oxley and HIPAA.
  • Hard Disk Failure.
  • What is a RAID?
  • Different types of RAID and their uses.
  • Enterprise vs. Consumer Storage.
  • Demonstration.
Information Overload
  • What the heck is an exabyte?
    • 1 billion gigabytes
  • The world generated 161 exabytes of digital information last year
  • IDC estimates that this will grow to 988 exabytes in 2010
    • almost 1 zettabyte!
  • 185 exabytes of storage available last year
  • IDC estimates that this will grow to 601 exabytes in 2010
  • We need more storage!
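The gap behind "We need more storage!" can be worked out directly from the figures on this slide. The snippet below is only an illustrative sketch; the dictionary and function names are made up for this example, and the numbers are the IDC estimates quoted above.

```python
# Figures quoted on the slide (IDC estimates), in exabytes.
EXABYTE_IN_GB = 10**9  # 1 exabyte = 1 billion gigabytes

generated = {"last year": 161, "2010": 988}
available = {"last year": 185, "2010": 601}

def shortfall_eb(period):
    """Exabytes generated minus exabytes of storage available (negative = surplus)."""
    return generated[period] - available[period]

for period in generated:
    gap = shortfall_eb(period)
    print(f"{period}: generated {generated[period]} EB, "
          f"available {available[period]} EB, gap {gap:+d} EB "
          f"({gap * EXABYTE_IN_GB:+,d} GB)")
```

By these estimates a small surplus of storage turns into a shortfall of several hundred exabytes by 2010.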
Proliferation of the Internet
  • How many web pages are there?
    • If you “Google” anything, you’ll get at least a billion choices
  • Web pages used to be just text and graphics
  • Now, audio & video clips are prevalent
  • Hosting companies need to deal with data storage on a whole new level
On-Demand Audio & Video
  • What about companies who specialize in On-Demand audio/video delivery?
    • YouTube
    • Google Video
  • They make it so easy to upload content
  • How do these companies deal with managing all of this data?
Digital Audio
  • Remember Napster?
  • Who buys CDs anymore?
  • What about companies who provide downloadable audio content?
    • iTunes
    • Rhapsody
    • MP3.com
  • Most of these companies provide video as well!
  • Also, Podcasting & Vodcasting are becoming more popular
Photographic Content
  • Everybody is a photographer these days!
    • Camera Phone
    • Digital Camera
  • Hosting companies allow users to upload photos easily
    • Flickr
    • Photobucket
  • Where are all these photos stored?
Database Driven Applications
  • Database driven websites rely heavily on data integrity
  • Companies like Amazon, eBay, and Citizens Bank all have huge backbones
    • They rely on storage!
  • The National Security Agency has a database of phone call records of “tens of millions” of Americans
  • Blogs & Wikis
    • Is this data backed up?
    • Can you imagine if you lost your MySpace account?
E-Mail
  • Most popular mode of communication
  • When you send a message, where does it go?
  • If you’re like most, e-mail is a lifeline
  • For large companies, e-mail backup is a must!
Sarbanes-Oxley and HIPAA
  • Sarbanes-Oxley (SOX) was signed into law by George W. Bush in 2002 in the wake of the Enron scandal.
  • Changed how publicly held businesses are responsible for data retention.
  • Enormously profitable for storage industry.
Financial Impact of SOX
  • Estimated annual compliance spending: $17 billion to $28.8 billion
  • Great for storage industry
  • Data retention:
  • Net effect: doubling both the retention length and the number of copies = a lot more storage!
  • Source: The Economist, March 4, 2004; InformationWeek, March 2, 2006
SOX: A Boon and A Burden.
  • While it has been a great source of financial gain for storage and IT vendors, it has been a huge headache for CIOs and IT staff.
  • Estimated man hours: countless.
  • New York Times: dedicated 200 employees in 2003, 105 full time on compliance project.
  • Washington Post: Spent $5 Million on outside consultants, created 10 full time positions.
  • CISA: Certified Information Systems Auditor
  • Source: InformationWeek, March 2, 2006; Business Matters, March 2005.
HIPAA: Health Insurance Portability and Accountability Act
  • Signed into law in 1996.
  • Desired effect was to promote EDI, or Electronic Data Interchange among various healthcare bodies.
  • Protect Patient Privacy
  • Protect Security of Patient Information
Data Management for the User
  • As a user, why do I care?
  • Where do you store all of that music you illegally downloaded?
  • Again, sites like YouTube and Flickr allow you to upload your own media content
    • Where do you store all of your home-grown movies?
    • How do you back up your photo library?
  • Hard drives fill up fast!
Microprocessor Technological Advances
  • As microprocessor technology improves, so does memory size
  • How does this benefit overall computer performance?
    • It doesn’t unless secondary storage progresses at the same rate
  • Increased microprocessor speed opens the door to newer processor-intensive applications
    • Users need more space
Magnetic Disk Technology
  • MTTF: Mean Time to Failure.
  • Not a question of “will it fail?” but “when will it fail?”
  • Current drives run at speeds from 5,400 to 15,000 RPM.
  • Electromechanical parts: spindle motor, actuator arm both prone to failure; magnetism can wear out.
  • Discuss enterprise vs. consumer storage later.
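To see why MTTF matters more as drives are grouped together, here is a rough sketch of the usual back-of-the-envelope estimate. It assumes disks fail independently at a constant rate, so that an array without redundancy is expected to see its first failure after roughly the single-disk MTTF divided by the number of disks. The 500,000-hour figure is a hypothetical value chosen for illustration, not one from the slides.

```python
HOURS_PER_YEAR = 24 * 365

def array_mttf_hours(disk_mttf_hours, num_disks):
    """Expected time until the first disk in the array fails.

    Assumes independent failures at a constant rate (a simplification).
    """
    return disk_mttf_hours / num_disks

disk_mttf = 500_000  # hypothetical single-disk MTTF in hours
for n in (1, 4, 22):
    hours = array_mttf_hours(disk_mttf, n)
    print(f"{n:2d} disks: first failure expected after "
          f"~{hours / HOURS_PER_YEAR:.1f} years")
```

The more disks in a unit, the sooner one of them is expected to fail, which is the motivation for the redundancy discussed next.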
RAID
  • “Redundant Array of Independent Disks”
    • First proposed in the paper, “A Case for Redundant Arrays of Inexpensive Disks”, published in 1988
  • Method of combining several disk drives into one “Logical Unit Number” (LUN)
    • Appears as a single storage unit to the host system
2 Most Important Features
  • Reliability
    • RAID makes use of “redundancy”
      • Data is redundantly distributed over all or some of the disks providing fault tolerance and data protection
  • Performance
    • Disk performance is enhanced because multiple disks are working in parallel
RAID Level 0
  • No Redundancy
  • Uses a technique called “striping”
    • Data is broken down into blocks
    • Each block is written to a separate disk
  • Provides excellent write performance
    • Data is spread out
  • No data protection
    • If one disk fails, all data in the array is lost
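Striping can be sketched in a few lines. This is a toy model, not how a real controller works: the "disks" are Python lists, and the function names `stripe` and `unstripe` are invented for this example.

```python
def stripe(data: bytes, num_disks: int, block_size: int):
    """Split data into blocks and deal them round-robin across the disks."""
    disks = [[] for _ in range(num_disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].append(block)
    return disks

def unstripe(disks):
    """Reassemble the data by reading blocks back in round-robin order."""
    out = []
    longest = max(len(d) for d in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)

disks = stripe(b"ABCDEFGHIJ", num_disks=3, block_size=2)
print(disks)
print(unstripe(disks))
```

Because every disk holds only a fragment of each file, losing any one list in `disks` makes the original data unrecoverable.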
RAID Level 1
  • Uses a technique called “mirroring”
    • All data is written to at least two separate disks
  • If one disk fails, there’s a copy
  • Provides full redundancy: every block has a complete copy
  • Write performance is compromised
    • All data is written twice
  • Read performance can be better than that of a single disk
    • Data can be read from multiple disks at once
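A minimal sketch of mirroring, again as a toy model. The `Mirror` class and its methods are hypothetical names for this illustration; each "disk" is a dictionary from block number to data.

```python
class Mirror:
    """Toy RAID 1: every write goes to all working disks."""

    def __init__(self, num_copies=2):
        self.disks = [dict() for _ in range(num_copies)]  # block number -> data
        self.failed = set()

    def write(self, block_no, data):
        # The same data is written once per working disk (the write cost).
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block_no] = data

    def fail(self, disk_index):
        """Simulate a disk failure: its contents are gone."""
        self.failed.add(disk_index)
        self.disks[disk_index].clear()

    def read(self, block_no):
        # Any surviving copy can serve the read.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[block_no]
        raise OSError("all mirrors have failed")

m = Mirror()
m.write(0, b"payroll")
m.fail(0)
print(m.read(0))  # still readable from the surviving copy
```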
RAID Level 2
  • Uses a technique similar to “striping”
    • Words are split at the bit level
    • Each bit is written to a separate disk
  • Hamming codes are generated for each word
    • Spread across separate Error Correcting Code disks
    • Data is cross-referenced with codes to ensure data integrity
  • Write performance is compromised since Hamming codes need to be calculated each time
  • No commercial implementation
    • Too expensive
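To give a feel for what the Hamming codes do, here is a sketch using the small Hamming(7,4) code: 4 data bits plus 3 check bits, one bit per "disk". The choice of the (7,4) code and the function names are assumptions for illustration; the slides do not say which code size an implementation would use.

```python
def hamming_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Codeword positions 1..7 hold: p1, p2, d1, p3, d2, d3, d4.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming_correct(code):
    """Fix a single flipped bit; return (data bits, 1-based bad position or 0)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], syndrome

word = [1, 0, 1, 1]
codeword = hamming_encode(word)
damaged = list(codeword)
damaged[4] ^= 1                      # one "disk" returns a wrong bit
print(hamming_correct(damaged))      # recovers the word and names position 5
```

Three extra disks for every four data disks hints at why the slide calls this level too expensive.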
RAID Level 3
  • Uses a technique called “bit-interleaved parity”
    • Words are split at the bit level
    • Each bit is written to a separate disk
  • Parity bits are generated for each word
    • Stored on a separate parity disk
  • Read and write performance is compromised since all the disks are used for every operation, so only one request can be serviced at a time
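The parity idea can be sketched with a single word. In this toy example (names are invented for illustration) each bit of the word lives on its own data disk and the parity bit on a separate disk; if one data disk is lost, its bit is the XOR of everything that survives.

```python
def parity_bit(bits):
    """XOR of all bits: 0 if the number of 1s is even, 1 if odd."""
    p = 0
    for b in bits:
        p ^= b
    return p

def rebuild_missing_bit(surviving_bits, parity):
    """The lost bit is whatever makes the overall XOR match the stored parity."""
    return parity_bit(surviving_bits) ^ parity

word = [1, 0, 1, 1, 0, 0, 1, 0]    # one bit per data disk
p = parity_bit(word)               # stored on the parity disk

lost_disk = 2
survivors = word[:lost_disk] + word[lost_disk + 1:]
recovered = rebuild_missing_bit(survivors, p)
print(recovered == word[lost_disk])
```

Unlike the Hamming codes of RAID 2, a single parity bit cannot tell which disk is wrong; it relies on the failed disk being known.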
RAID 4: Block Interleaved Parity
  • Writes data in blocks instead of bits.
  • Advantage: high read performance.
  • Disadvantages: Dedicated Parity Drive causes severe write bottleneck, requires complex hardware controller.
  • Requires at least 3 disks to implement.
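The write bottleneck follows from how parity is kept up to date. The sketch below assumes the common read-modify-write approach, where a small write patches the parity as P' = P xor old xor new. Function names are hypothetical. Every such write touches the parity block, so with one dedicated parity drive all writes queue up on that drive.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def full_parity(blocks):
    """Parity block for a whole stripe: XOR of all data blocks."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity

def small_write(data_blocks, parity, disk_index, new_block):
    """Update one data block and patch the parity: P' = P xor old xor new."""
    old_block = data_blocks[disk_index]
    new_parity = xor_blocks(xor_blocks(parity, old_block), new_block)
    data_blocks[disk_index] = new_block
    return new_parity

data_blocks = [b"AAAA", b"BBBB", b"CCCC"]   # one block per data disk
parity = full_parity(data_blocks)            # lives on the parity disk
parity = small_write(data_blocks, parity, 1, b"ZZZZ")

# Losing any one data disk: XOR the parity with the surviving blocks.
rebuilt = xor_blocks(xor_blocks(parity, data_blocks[0]), data_blocks[2])
print(rebuilt)
```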
RAID 5: Block Interleaved Distributed Parity
  • Solves RAID 4 bottleneck.
  • Parity distributed over all drives. Allows multiple read / writes which increases efficiency.
  • Advantage: most versatile overall; file, web, database, internet servers all can use.
  • Disadvantage: requires a complex controller.
  • Requires at least 3 disks to implement.
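Distributing the parity simply means the parity block sits on a different disk for each stripe. The rotation used below is one possible scheme, chosen here as an assumption for illustration; real controllers use several different layouts.

```python
def parity_disk(stripe_no, num_disks):
    """Disk that holds parity for a given stripe (one possible rotation)."""
    return (num_disks - 1 - stripe_no) % num_disks

def layout(num_stripes, num_disks):
    """Rows of 'D' (data) and 'P' (parity), one row per stripe."""
    rows = []
    for s in range(num_stripes):
        p = parity_disk(s, num_disks)
        rows.append(["P" if d == p else "D" for d in range(num_disks)])
    return rows

for row in layout(num_stripes=4, num_disks=4):
    print(" ".join(row))
```

Since writes to different stripes update parity on different disks, they no longer all wait on a single parity drive.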
RAID Level 6: Block interleaved Striping with Dual Error Protection
  • Advantage: Implements both parity (P) and Reed-Solomon codes (Q) to protect against the loss of two drives.
  • Can be thought of as an extension of RAID 5.
  • Disadvantage: requires more complex controller with high overhead; requires N+2 disks.
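The extra overhead comes from the Q calculation, which needs finite-field arithmetic rather than plain XOR. The sketch below is one common way to build P and Q, assuming the field GF(2^8) with polynomial 0x11d and generator 2; these choices and all function names are assumptions for illustration, not details from the slides.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    """Inverse of a nonzero byte: a^254, because a^255 = 1 in GF(2^8)."""
    return gf_pow(a, 254)

def compute_pq(blocks):
    """P is the XOR of the blocks; Q weights block i by 2^i before XORing."""
    size = len(blocks[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(blocks):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)

def recover_two(blocks, x, y, p, q):
    """Rebuild lost data blocks x and y (None entries in blocks) from P and Q."""
    size = len(p)
    known = [b if b is not None else bytes(size) for b in blocks]
    p_part, q_part = compute_pq(known)      # contribution of surviving disks
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(size), bytearray(size)
    for j in range(size):
        pxy = p[j] ^ p_part[j]              # Dx xor Dy
        qxy = q[j] ^ q_part[j]              # 2^x*Dx xor 2^y*Dy
        dx[j] = gf_mul(inv, qxy ^ gf_mul(gy, pxy))
        dy[j] = pxy ^ dx[j]
    return bytes(dx), bytes(dy)

data = [b"disk0", b"disk1", b"disk2", b"disk3"]
p, q = compute_pq(data)
damaged = [data[0], None, data[2], None]    # two drives lost at once
print(recover_two(damaged, 1, 3, p, q))
```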
Hybrid RAID: X+Y vs. Y+X
  • RAID 0+1:
    • Mirrored stripe set: minimum of 4 drives = $
    • Good for imaging / general file server / any area where highest reliability is not a concern.
  • RAID 1+0:
    • Striped mirror set: minimum of 4 drives = $
    • Good for databases.
  • RAID 5+1:
    • Mirrored RAID 5 for the truly paranoid.
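The difference between 0+1 and 1+0 shows up when two drives fail. The sketch below enumerates every two-disk failure in a hypothetical 4-disk array (disks 0 and 1 form one group, disks 2 and 3 the other) and counts how many each layout survives. The grouping and function names are assumptions for illustration.

```python
from itertools import combinations

GROUPS = [{0, 1}, {2, 3}]  # hypothetical 4-disk layout: two groups of two

def survives_0_plus_1(failed):
    """Mirror of stripes: at least one whole stripe set must be untouched."""
    return any(not (group & failed) for group in GROUPS)

def survives_1_plus_0(failed):
    """Stripe of mirrors: every mirror pair must keep at least one disk."""
    return all(group - failed for group in GROUPS)

def count_survivable(survives, num_failures=2):
    """Return (survivable cases, total cases) for the given failure count."""
    cases = [set(c) for c in combinations(range(4), num_failures)]
    return sum(survives(c) for c in cases), len(cases)

print("RAID 0+1:", count_survivable(survives_0_plus_1))
print("RAID 1+0:", count_survivable(survives_1_plus_0))
```

Under these assumptions 1+0 survives more of the double-failure cases than 0+1, which fits the slide's suggestion to use 0+1 where the highest reliability is not a concern.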
RAID Z
  • Uses the 128-bit ZFS file system from Sun’s Solaris 10 OS
  • Available on Mac OS X Leopard
  • Advantage: OS calculates parity, no need for external controller, can correct mistakes impossible to correct in RAID 5.
  • Disadvantage: Could take a performance hit if storage close to full.
Enterprise vs. Consumer Storage
  • Enterprise quality storage requires much more engineering
  • Environment plays a big role:
  • Chassis vibration, humidity, volatile solvents, heat, constant use…
Enterprise vs. Consumer Storage
  • Seagate Barracuda (consumer)
    • 7200 RPM 250 GB SATA II drive
    • $75.00
    • SATA connector
  • Seagate Cheetah (enterprise)
    • 15,000 RPM 147 GB SCSI drive
    • $1,100
    • 80-pin SCSI cable
Demonstration
  • Old Sun software-based RAID unit.
    • Employs Fibre-Channel Connection.
    • Houses 22 SCSI disks.

  • Hard drive demonstration.
    • See the arm move over the disk while writing a large file.