Email archiving

Email Archiving. Arvind Srinivasan, Gaurav Baone. Imagine this is what happens to your business records at the end of every month … SEC 17a-4. FDA 21 CFR 11. NASD 3010, 3110. DoD 5015.2. HIPAA. Sarbanes-Oxley. If this looks absurd … That’s exactly what we do to email!


Presentation Transcript


Email Archiving

Arvind Srinivasan

Gaurav Baone


Imagine this is what happens to your business records at the end of every month …


SEC 17a-4

FDA 21 CFR 11

NASD 3010, 3110

DoD 5015.2

HIPAA

Sarbanes-Oxley

If this looks absurd …

That’s exactly what we do to email!

Practically every major transaction, project, and contract is recorded in email

Regulators now treat email like hard copy records

And the courts agree (FRCP, Dec 2006)

Non-compliance fines and legal liabilities are rising . . .

ZipLip, Inc.


Just How Much Scalability Does Archiving Require?

Assume:

  • 25,000 employees averaging 70 mails/day

  • 7 years retention

4.47 billion emails for the archive system to index & search

versus

4.28 billion web pages indexed by Google (source: Google Press Release, Feb 17, 2004)

Functionality needs to scale to these volumes
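
The 4.47 billion figure can be reproduced from the stated assumptions. The slide does not say how many days per year mail is sent; the sketch below assumes 365, which is the value that yields the quoted number.

```python
# Back-of-the-envelope check of the archive volume quoted on the slide.
EMPLOYEES = 25_000
MAILS_PER_DAY = 70
RETENTION_YEARS = 7
DAYS_PER_YEAR = 365  # assumption: not stated on the slide


def archived_emails(employees, mails_per_day, years, days_per_year=DAYS_PER_YEAR):
    """Total number of messages held once the retention window is full."""
    return employees * mails_per_day * days_per_year * years


total = archived_emails(EMPLOYEES, MAILS_PER_DAY, RETENTION_YEARS)
print(f"{total:,} emails (~{total / 1e9:.2f} billion)")
# prints: 4,471,250,000 emails (~4.47 billion)
```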


Outline

  • Email Capture Methods

  • Business Drivers

  • Archive Functionality

  • Retention & Deletion

  • Surveillance & Compliance

  • E-Discovery

  • Conclusion


Email Capture Methods

  • Active Capture Methods – PRO-ACTIVE Archiving

    • Journaling

    • Mailbox crawling

    • SMTP Gateway Capture

  • Historical Capture Methods – REACTIVE Archiving

    • Restore from backup tapes

    • Crawl for PST / NSF files from desktops

    • Forensic captures


Journaling – 100% Capture


Mailbox Crawling – Policy Based


Reactive Archiving


Not Just Email


Primary Business Drivers - Regulations and Laws

SEC 17a-4

NASD 3010

Gramm-Leach-Bliley Act

HIPAA

Hedge Funds Rule 203(b)

Basel II

CA SB1386

Sarbanes-Oxley Act

Mutual Funds Rule 38a-1

NASD 3011

Investment Advisors Act

UK Freedom of Information Act

US Freedom of Information Act

Canada PIPEDA

Florida Sunshine Law

FRCP

Japan Personal Information Protection Act

DoD 5015.2


Functional Requirements

  • Retention

  • Surveillance and Compliance

  • E-Discovery

    • Common Theme - Classification


Retention & Deletion

Conflicting Requirements:

  • Laws & Regulation => Retain for “x” years.

  • Vs

  • Company Liability/Risk and Cost

  • Real-time Categorization of Mail

    • Sender/Recipients

    • Content (Subject, body, attachment)

    • User Input (which folder the mail was found in, manual tagging)
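
The three categorization inputs listed above can be sketched as a small rule function. Everything concrete here is a hypothetical illustration rather than the presenters' system: the `Mail` fields, the `legal.example.com` domain, the finance terms, and the folder mapping are all assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Mail:
    sender: str
    recipients: list
    subject: str = ""
    body: str = ""
    attachment_names: list = field(default_factory=list)
    folder: str = ""                                  # user input: folder the mail was found in
    manual_tags: set = field(default_factory=set)     # user input: manual tagging


# Hypothetical rules, for illustration only.
LEGAL_DOMAIN = "legal.example.com"
FINANCE_TERMS = ("contract", "invoice")
FOLDER_CATEGORIES = {"hr": "personnel"}


def categorize(mail):
    """Return the set of categories assigned by the three kinds of signal."""
    categories = set()

    # 1. Sender/recipients
    addresses = [mail.sender, *mail.recipients]
    if any(a.lower().endswith("@" + LEGAL_DOMAIN) for a in addresses):
        categories.add("legal")

    # 2. Content: subject, body, attachment names
    text = " ".join([mail.subject, mail.body, *mail.attachment_names]).lower()
    if any(term in text for term in FINANCE_TERMS):
        categories.add("financial")

    # 3. User input: folder and manual tags
    folder_category = FOLDER_CATEGORIES.get(mail.folder.lower())
    if folder_category:
        categories.add(folder_category)
    categories |= mail.manual_tags

    return categories or {"uncategorized"}


mail = Mail(
    sender="counsel@legal.example.com",
    recipients=["bob@example.com"],
    subject="Draft contract",
    folder="HR",
    manual_tags={"project-x"},
)
print(sorted(categorize(mail)))
# prints: ['financial', 'legal', 'personnel', 'project-x']
```

A retention period of "x" years would then be looked up per category; the slide leaves those values open, so they are not modelled here.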


Retention & Deletion (cont’d)

  • "a priori" and "a posteriori" based Retention.

  • Event Driven – Deletion of mail from user folder, Reclassification of mail by end user

  • Legal Hold – Court Orders to retain evidence relating to certain subject matters.

  • Single Instance Storage

    • Same Email in Multiple Mailboxes

    • Same Attachment in Multiple Emails

    • Significant storage savings.
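
Single instance storage is commonly built on content addressing: identical bytes hash to the same key, so they are stored once and referenced many times. The sketch below assumes SHA-256 as the key and an in-memory dictionary as the store; `SingleInstanceStore` and its methods are hypothetical names, not an API from the presentation.

```python
import hashlib


class SingleInstanceStore:
    """Keeps one physical copy per distinct content, plus who refers to it."""

    def __init__(self):
        self._blobs = {}   # digest -> content bytes (stored once)
        self._refs = {}    # digest -> set of owners (mailboxes or messages)

    def put(self, owner, content):
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)          # no-op if already stored
        self._refs.setdefault(digest, set()).add(owner)
        return digest

    def stored_bytes(self):
        """Bytes actually kept on disk."""
        return sum(len(blob) for blob in self._blobs.values())

    def logical_bytes(self):
        """Bytes that would be kept if every owner held a private copy."""
        return sum(len(self._blobs[d]) * len(owners) for d, owners in self._refs.items())


store = SingleInstanceStore()
attachment = b"quarterly-report" * 1000
for mailbox in ("alice", "bob", "carol"):
    store.put(mailbox, attachment)

print(store.logical_bytes(), store.stored_bytes())
# prints: 48000 16000
```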


Surveillance

Conflicting Requirements:

  • Regulations require review of documents

  • Vs

  • Effort spent reviewing the documents.

  • Real-time Flagging of Mail

    • Lexical Based – Key words, word associations, wild-cards

    • Policy Based – e.g. mail from WallStreetJournal.com is a newsletter.

    • Custom Code – Detect vacation responses, read receipts, DSNs
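
The three flagging styles above can be sketched with the Python standard library. The lexicon and the newsletter domain list are hypothetical. The machine-generated checks rely on the standard `Auto-Submitted` header and on `multipart/report` content types for delivery status and disposition notifications; real systems would likely need more heuristics than this.

```python
import re
from email import message_from_string
from email.utils import parseaddr
from fnmatch import fnmatchcase

LEXICON = ["guarantee*", "insider", "off the record"]   # hypothetical terms
NEWSLETTER_DOMAINS = {"wallstreetjournal.com"}           # policy from the slide


def lexical_hits(text):
    """Lexical rules: phrases match as substrings, single words allow wildcards."""
    lowered = text.lower()
    words = re.findall(r"[a-z0-9']+", lowered)
    hits = set()
    for pattern in LEXICON:
        if " " in pattern:
            if pattern in lowered:
                hits.add(pattern)
        elif any(fnmatchcase(word, pattern) for word in words):
            hits.add(pattern)
    return hits


def machine_generated(msg):
    """Custom code: recognise auto-replies, DSNs and read receipts."""
    if (msg.get("Auto-Submitted") or "no").strip().lower() != "no":
        return "auto-reply"
    if msg.get_content_type() == "multipart/report":
        report_type = str(msg.get_param("report-type", "")).lower()
        if report_type == "delivery-status":
            return "dsn"
        if report_type == "disposition-notification":
            return "read-receipt"
    return None


def flag(raw_message):
    """Return (decision, detail) for one raw RFC 5322 message string."""
    msg = message_from_string(raw_message)
    kind = machine_generated(msg)
    if kind:
        return ("skip", kind)
    domain = parseaddr(msg.get("From", ""))[1].rpartition("@")[2].lower()
    if domain in NEWSLETTER_DOMAINS:            # policy rule
        return ("skip", "newsletter")
    body = "" if msg.is_multipart() else msg.get_payload()
    hits = lexical_hits(f"{msg.get('Subject', '')} {body}")
    return ("flag", sorted(hits)) if hits else ("pass", [])


print(flag("From: trader@example.com\nSubject: Deal\n\n"
           "We can guarantee returns, strictly off the record.\n"))
# prints: ('flag', ['guarantee*', 'off the record'])
```

Running the cheap skip rules first keeps newsletters and auto-replies out of the reviewer queue before any lexical matching is done.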


Surveillance (cont’d)

  • Real-time Flagging is a categorization problem

  • Current systems suffer from a lot of false positives.

  • Transparent and deterministic rules are preferred over black boxes.

  • Disclaimers (internal and external) tend to get flagged because they contain the very terms that we try to flag.

  • Use Reviewer feedback to adapt the rules.
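
One transparent, deterministic way to reduce disclaimer-driven false positives is to remove known disclaimer text before the lexical rules run. The sketch below assumes the organisation maintains a list of disclaimer templates; the template text and function names are hypothetical, and exact-template matching will miss reworded disclaimers.

```python
import re

# Hypothetical template; a real deployment would maintain its own list.
KNOWN_DISCLAIMERS = [
    "This message may contain confidential or privileged information.",
]


def _normalize(text):
    """Collapse whitespace and lowercase so line wrapping does not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def strip_disclaimers(body, disclaimers=KNOWN_DISCLAIMERS):
    """Return the normalized body with every known disclaimer removed."""
    cleaned = _normalize(body)
    for disclaimer in disclaimers:
        cleaned = cleaned.replace(_normalize(disclaimer), " ")
    return _normalize(cleaned)


body = "Lunch at noon?\n\nThis message may contain confidential\nor privileged information."
print(strip_disclaimers(body))
# prints: lunch at noon?
```

Reviewer feedback could feed this list: a disclaimer that reviewers repeatedly dismiss becomes a new template.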


E-Discovery

Conflicting Requirements:

  • Produce electronic docs. to satisfy court-orders

  • Vs.

  • Providing insufficient, irrelevant, or privileged information

  • Discovery Request

    • Certain number of custodians

    • Date Range

    • Pertaining to certain subject matter; usually described by a set of search terms.

┼ Source: Williams v. Taser Int’l, Inc., 2007 WL 1630875 (N.D. Ga. June 4, 2007)
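
The three parts of a discovery request described above (custodians, date range, search terms) map naturally onto a filter over the archive. This is a minimal sketch: the class and field names are hypothetical, and search terms are treated as plain case-insensitive substrings, whereas negotiated terms in practice are usually richer queries.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchivedMail:
    custodian: str
    sent: date
    text: str


@dataclass(frozen=True)
class DiscoveryRequest:
    custodians: frozenset
    start: date
    end: date
    search_terms: tuple

    def matches(self, mail):
        """True if the mail falls inside all three dimensions of the request."""
        if mail.custodian not in self.custodians:
            return False
        if not (self.start <= mail.sent <= self.end):
            return False
        lowered = mail.text.lower()
        return any(term.lower() in lowered for term in self.search_terms)


request = DiscoveryRequest(
    custodians=frozenset({"alice", "bob"}),
    start=date(2006, 1, 1),
    end=date(2006, 12, 31),
    search_terms=("warranty", "recall"),
)
archive = [
    ArchivedMail("alice", date(2006, 3, 2), "Product recall plan attached"),
    ArchivedMail("alice", date(2007, 3, 2), "Recall follow-up"),
    ArchivedMail("carol", date(2006, 3, 2), "Warranty terms"),
]
responsive = [m for m in archive if request.matches(m)]
print(len(responsive))
# prints: 1
```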


E-Discovery (cont’d)

  • Landmark case: Zubulake v. UBS Warburg (2003)

  • Primarily driven by the amendments to the Federal Rules of Civil Procedure (FRCP) that took effect in December 2006.

  • Litigants are entitled to obtain electronic information from the adverse party.

  • Voluntary Initial Disclosures need to be made pertaining to each litigant

  • Today, almost all cases have some sort of electronic documents as evidence.


E-Discovery (cont’d)

  • Parties face sanctions if they do not provide all the relevant documents (numerous precedents, e.g. Metrokane vs Built NY 2008). Validation occurs when the receiving party can prove the existence of another document through a hard-copy printout or other means.

  • Lawyers from both parties routinely negotiate keywords to define Search Concepts

  • Manual review of documents for relevance and privilege. Numerous products cluster similar documents (near-deduplication) and present them together to reviewers to improve efficiency.

  • Chain of Custody – To prove that the document has not been tampered with or altered.
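
One common way to support a chain-of-custody claim is to record a cryptographic digest of the document at capture time and to link every later handling event to the previous one by hash. The sketch below is an assumption about how this could be done, not a description of any particular product; `CustodyLog` and its fields are hypothetical.

```python
import hashlib
import json


def digest(data):
    return hashlib.sha256(data).hexdigest()


class CustodyLog:
    """Hash-chained log: each entry commits to the document and to the prior entry."""

    def __init__(self, document):
        self.document_digest = digest(document)
        self.entries = []

    def record(self, actor, action):
        prev = self.entries[-1]["entry_digest"] if self.entries else self.document_digest
        entry = {"actor": actor, "action": action, "prev": prev}
        entry["entry_digest"] = digest(json.dumps(entry, sort_keys=True).encode())
        self.entries.append(entry)

    def verify(self, document):
        """True only if the document and every log entry are unaltered."""
        if digest(document) != self.document_digest:
            return False
        prev = self.document_digest
        for entry in self.entries:
            body = {key: entry[key] for key in ("actor", "action", "prev")}
            expected = digest(json.dumps(body, sort_keys=True).encode())
            if entry["prev"] != prev or entry["entry_digest"] != expected:
                return False
            prev = entry["entry_digest"]
        return True


log = CustodyLog(b"original message bytes")
log.record("archive", "captured via journaling")
log.record("reviewer-7", "exported for production")
print(log.verify(b"original message bytes"), log.verify(b"edited message bytes"))
# prints: True False
```

A hash chain only shows that the log and the document agree with each other; keeping the log on write-once storage or signing it would be needed to rule out wholesale replacement.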


Palin’s e-mail at $15m per request

  • NBC's price quote for e-mails sent to Todd Palin: $15 million.

  • AP's price quote for e-mails between state employees and the campaign headquarters of Sen. John McCain: $15 million.

  • AP's price quote for e-mails between state employees and the National Park Service: $15 million.


Conclusion

  • Most challenges in archiving can be reduced to a classification problem.

  • Segmentation Problems: Detect internal and external disclaimers

  • Detect change in Email behavior through email profile analysis

  • Understanding mails: Need to develop Analysis techniques to understand the contents

  • Visualization and Grouping Similar mails – Control the order in which mails and documents are viewed.

  • Consistent way of defining Subject Matters – Beyond just a set of keywords.

  • Extract more meta data about attachments such as images, audio and video files.

  • And all the above are required in multiple languages – English, Japanese, Spanish, Chinese, and others.

