The Data Deluge. “ The Growth of Unstructured Data ” Dr Kevin McIsaac, IBRS Overview. The Impact of Changes in Data Growth Rates Exploiting Data Management Technologies Taking Control Of E-mail Conclusions. The Impact of Changes in Data Growth Rates.


The Data Deluge

“The Growth of Unstructured Data”

Dr Kevin McIsaac, IBRS

Overview

The Impact of Changes in Data Growth Rates

Exploiting Data Management Technologies

Taking Control Of E-mail

Conclusions

The Impact of Changes in Data Growth Rates

Data growth rates accelerate

The “unstructured data” tipping point

How big is the impact?

Data Growth Rates Accelerate

92% of all new data is stored on magnetic media, primarily hard disks.

That data grew about 30% pa between 1999 and 2002

Growth is forecast at 60% pa through 2011!

i.e., your storage capacity will double every 18 months!

2007: First 1TB disk!

Source: Computer World/IBRS Data Management Survey

So What’s New? Data Has Always Grown At High Rates.
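
The doubling claim above can be checked with compound-growth arithmetic. The following is a minimal sketch, assuming growth compounds annually at a constant rate; the function names are illustrative and not from the survey.

```python
import math


def doubling_time_months(annual_growth: float) -> float:
    """Months needed for stored data to double at a compound annual
    growth rate (e.g. 0.60 for 60% pa)."""
    return 12 * math.log(2) / math.log(1 + annual_growth)


def capacity_after(start_tb: float, annual_growth: float, years: float) -> float:
    """Capacity required after `years` of compound growth."""
    return start_tb * (1 + annual_growth) ** years


if __name__ == "__main__":
    # Compare the historical 30% pa rate with the forecast 60% pa rate.
    for rate in (0.30, 0.60):
        months = doubling_time_months(rate)
        print(f"{rate:.0%} pa: doubles roughly every {months:.1f} months")
```

At 60% pa the doubling time comes out just under 18 months, which matches the slide; at the earlier 30% pa rate it is well over two years.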

The “Unstructured Data” Tipping Point

What is “Unstructured Data”?

We have reached a tipping point where

More than ½ of all data managed by IT is unstructured

Merrill Lynch estimates 85% of business data is unstructured

Some of your largest data sets are unstructured, e.g., e-mail

Unstructured data growth rate of 65%-200%

But 38% of ITOs lack a document management system

Source: Computer World/IBRS Data Management Survey

Data Management Was Traditionally About Managing Structured Data. This Focus Needs to Change.

How Big Is The Impact?

Source: Computer World/IBRS Data Management Survey

IT Must Learn To Manage Unstructured Data As Effectively As It Does Structured Data Today

  • Office workers spend an average of 9.5 hr/wk searching, gathering and analysing information, with 60% of that on the Internet
    • Outsell
  • White collar workers spend 30% - 40% of their time managing documents
    • Gartner
  • Our survey highlights
    • Strong concerns with the rate of unstructured data growth
    • Lack of systems to manage this
    • Few concerns with the storage infrastructure.
Exploiting Data Management Technologies

Advances in Storage Hardware

Commoditisation of Storage Arrays

Information Lifecycle Management

Document Management

Data Classification

Disaster Recovery Readiness

Advances in Storage Hardware

Shugart’s Law: $ per bit of magnetic storage halves every 18 months

A decline of ~37% pa (10%/Q), recently 50% pa!

A flat budget supports ~60% pa growth

SANs well established & a commodity

Fully featured arrays reasonably priced

iSCSI taking off as a complement to FC

Bolt-on storage virtualisation not gaining traction

Content Addressable Storage

Use for long term archive.

TCO benefits are in the long term management of data

Shugart’s Law Ensures Drive Costs Are Contained, But What About The System Costs?
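
The Shugart’s Law figures on this slide follow from one another. The sketch below shows the arithmetic, assuming a smooth exponential price decline and a budget spent entirely on raw disk capacity (both simplifications; function names are illustrative).

```python
def annual_price_decline(halving_months: float = 18.0) -> float:
    """Fractional drop in $ per bit over one year when the price
    halves every `halving_months` months."""
    return 1 - 0.5 ** (12 / halving_months)


def flat_budget_growth(annual_decline: float) -> float:
    """Capacity growth per year that an unchanged budget can buy
    when $ per bit falls by `annual_decline` each year."""
    return 1 / (1 - annual_decline) - 1


if __name__ == "__main__":
    decline = annual_price_decline(18)
    print(f"Price decline: {decline:.0%} pa")
    print(f"Flat budget buys {flat_budget_growth(decline):.0%} more capacity pa")
    # At the more recent 50% pa decline, a flat budget doubles capacity each year.
    print(f"At 50% pa decline: {flat_budget_growth(0.50):.0%} more capacity pa")
```

Halving every 18 months works out to about 37% pa, and a 37% pa price decline lets a flat budget absorb a little under 60% pa capacity growth, which is why drive costs alone are contained at the forecast growth rate.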

Commodity Storage Arrays

G1: Monolithic arrays

Proprietary & very expensive

G2: Modular arrays

Proprietary with commodity components, moderately expensive

G3: Commodity based arrays

Commodity components, standards based, inexpensive

SAS as a high performance, lower cost alternative to FC disk

Freely mix SAS and SATA in same frame

In-box virtualisation for simpler management and lower cost

Thin provisioning is the next big virtualisation technology

Potential for new vendors to challenge established players

e.g., Compellent, EqualLogic, 3PAR, etc.

Hardware Is Just A Small Part Of The Problem. Data Management Processes Are More Important

Information Lifecycle Management

Source: Computer World/IBRS Data Management Survey

While ILM Is The Holy Grail Of Storage Vendors, It Has Not Yet Been Widely Adopted

  • Automate the management of your data lifecycle policy
    • Retain, delete, migrate, archive
    • Defining and enforcing policy
      • Who sets policy? Who has authority?
      • IT is not the data owner, just the steward!
    • Start with tiered storage
      • Balance price with service levels
  • Due to high growth rates focus on unstructured data
    • Transactional data is generally OK
    • Archival of E-mail and Documents
  • Don’t confuse backup & archival!
    • Separate archive from backup
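
An automated lifecycle policy of the kind outlined above (retain, migrate, archive, delete) can be sketched as a small rule table. This is an illustrative sketch only: the classifications, the day thresholds and the names `Policy`, `POLICIES` and `lifecycle_action` are all hypothetical, and in practice the data owner, not IT, would set the values.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Dict, Optional


class Action(Enum):
    RETAIN = "retain"    # leave on primary storage
    MIGRATE = "migrate"  # move to a cheaper storage tier
    ARCHIVE = "archive"  # move to the archive (separate from backup)
    DELETE = "delete"


@dataclass(frozen=True)
class Policy:
    migrate_after_days: int
    archive_after_days: int
    delete_after_days: Optional[int]  # None means never delete


# Hypothetical policy table; the business owner sets these, IT enforces them.
POLICIES: Dict[str, Policy] = {
    "email": Policy(90, 365, 7 * 365),
    "contract": Policy(180, 730, None),
}


def lifecycle_action(classification: str, last_accessed: date, today: date,
                     policies: Dict[str, Policy] = POLICIES) -> Action:
    """Return the action the policy prescribes for one item."""
    policy = policies.get(classification)
    if policy is None:
        # Unclassified data has no policy, so the steward has no mandate to act.
        return Action.RETAIN
    age_days = (today - last_accessed).days
    if policy.delete_after_days is not None and age_days >= policy.delete_after_days:
        return Action.DELETE
    if age_days >= policy.archive_after_days:
        return Action.ARCHIVE
    if age_days >= policy.migrate_after_days:
        return Action.MIGRATE
    return Action.RETAIN


if __name__ == "__main__":
    today = date(2008, 1, 1)
    for label, age in [("email", 30), ("email", 400), ("unknown", 5000)]:
        action = lifecycle_action(label, today - timedelta(days=age), today)
        print(label, age, action.value)
```

Note that the fallback for unclassified data is to retain it indefinitely: without a classification and an agreed policy, nothing can safely be migrated, archived or deleted.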
Document Management

Document management can eliminate significant wasted time

“White collar workers spend 30% - 40% of their time managing documents”

But, 38% have no DM system and 50% only cover some documents

Document management needs to include e-mail

E-mail is often the largest unstructured data repository

But only 12% said document management includes e-mail

Source: Computer World/IBRS Data Management Survey

Document Management and ILM and Archiving Are All Predicated on Data Classification and Policy

Data Classification & Policy

Source: Computer World/IBRS Data Management Survey

  • Only 12% had a clear, formal policy. Without this:
    • IT can’t act responsibly as a steward
      • No mandate!
    • ILM is nearly impossible, i.e.,
      • Data can’t be deleted and archival is difficult.
  • Few had metadata or taxonomies, which hampers data use and reuse

Businesses Need to Invest in Data Classification & Policy

Disaster Recovery Readiness

Disaster recovery confidence levels are high, however…

44% said they had not tested their DR plan in the last 12 months.

35% said they had done only a limited disaster recovery test in the last 12 months.

Source: Computer World/IBRS Data Management Survey

Without Regular Testing Disaster Recovery Plans Are A Lottery

Taking Control Of E-mail

The Importance of E-mail

E-mail Data Management Challenges

Managing Users’ Mailboxes

The Importance of E-mail

80% said e-mail is more important than the telephone. 74% said being without e-mail is a greater hardship than losing the telephone.

META Group

A typical business user sends and receives around 600 e-mails per week

Ferris Research

The average office worker spends 49 min/day managing e-mail. Upper level managers spend up to 4 hrs/day. All that sending & receiving, responding & deleting takes an enormous toll on workplace productivity.

ePolicy Institute

E-mail Is An Essential Business Tool But E-Mail Data Management Is Still A “Cottage Industry”

E-mail Data Management Challenges

57% Said Managing E-mail Was One Of Their Top DM Problems

Top Exchange DM challenges

Managing Exchange disaster recovery

Managing the size of Message Stores

Protecting & searching individual .PST files

Restoring individual mailboxes

Responding to legal discovery and capturing all email for compliance

Osterman Research

Source: Computer World/IBRS Data Management Survey

Managing Users’ Mailboxes Is Key To All These Challenges

Managing Users’ Mailboxes

The common solution is to use mailbox quotas

40% use PSTs to limit growth, but 37% said this caused problems.

PSTs just shift the problem elsewhere

E-mail archival can be a powerful solution but…

Only 13% had successfully implemented e-mail archiving

Another 13% tried and failed!

Needs robust data management policy

Only 2% implemented an e-discovery/compliance solution!

Source: Computer World/IBRS Data Management Survey

Getting E-mail Under Control Is An Important And Urgent Issue, But Proceed With Great Caution

Conclusions

We have reached a tipping point, where unstructured data volume and growth exceed those of structured data

Learn to manage unstructured data as effectively as structured data

Invest in data classification & policy before applying technology
