Processing and Analyzing Electronic Data. Arizona Paralegal Association Phoenix, September 12, 2006. Cliff Shnier, JD Director, Business Development Cataphora Inc Scottsdale, AZ 480-661-6183 cliff@cataphora.com. The New Rules: they’re He-e-e-e-ere!.


Processing and Analyzing Electronic Data

Arizona Paralegal Association

Phoenix, September 12, 2006

Cliff Shnier, JD

Director, Business Development

Cataphora Inc

Scottsdale, AZ

480-661-6183

cliff@cataphora.com

The New Rules: they’re He-e-e-e-ere!
  • The Supreme Court approved the changes and transmitted them to Congress on April 12, 2006.
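  • No enabling legislation is needed: under the Rules Enabling Act, the changes take effect on December 1, 2006 unless Congress acts to modify or reject them.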
  • These rule changes affect Rules 16, 26, 33, 34, 37, 45 and Form 35.
FRCP 26(a) [amended]

Rule 26. General Provisions Governing Discovery; Duty of Disclosure

  • REQUIRED DISCLOSURES…

a party must, without awaiting a discovery request, provide a copy of all documents, electronically stored information, and tangible things… in its control that it may use to support its claim or defense.

  • This is as far as many countries go!
  • Quebec Code of Civil Procedure, Art. 331.1 and 402 – 403;
  • French Code of Civil Procedure, Article 753.
English Translation of Art. 753 of French Code of Civil Procedure:
  • Pleadings shall set out expressly the claims of the parties as well as the issues of law and fact which are the basis of each claim. A memorandum listing the documents in support of these claims shall be annexed to the pleadings.
    • And that’s all you have to produce, mon ami.
FRCP 26(b)
  • (b) DISCOVERY SCOPE AND LIMITS. … the scope of discovery is as follows:
  • (1) In General. Parties may obtain discovery regarding any matter, not privileged,...relevant to the claim or defense of any party, including… any books, documents...

This is much more than 26(a), and

This is where the U.S. goes much further than most other jurisdictions.

The view from “over there”
  • “In the United Kingdom, extensive American-style discovery is viewed as a cultural anomaly and a wasteful extravagance. Computer-based discovery is viewed as particularly obtrusive.”
    • Ken Withers, address at the University of Edinburgh, April 2001.
“exponentially greater volume”

From page 22 of the Commentary by the Rules Committee, Sept 2005

Electronic Data Volumes

1 MB = roughly 75 pages

1 GB = roughly 75,000 pages

Therefore, 30 GB = roughly 2.25 million pages

= 1,000 boxes

= 250 lineal feet of five-tier shelves

= 50 file cabinets.

“On a single ten square inch hard drive, more data can be stored than would fit on the entire floor of a building.”

Arkfeld, Electronic Discovery and Evidence, p. 1-9, quoting a 1999 article by Kimberly Richard in 21 Whittier L. Rev. 463
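
The rule-of-thumb conversions above can be checked with a few lines of arithmetic. This is only a sketch: the pages-per-MB figure comes from the slide, while the per-box, per-shelf-foot and per-cabinet ratios are not stated anywhere and are simply derived here from the slide's own totals. The function name is hypothetical.

```python
# Rule-of-thumb figure taken from the slide.
PAGES_PER_MB = 75
# The slide's "1 GB = 75,000 pages" implies 1,000 MB per GB.
MB_PER_GB = 1000
# Assumed ratios, derived from the slide's totals rather than stated on it.
PAGES_PER_BOX = 2250       # 2.25 million pages / 1,000 boxes
BOXES_PER_SHELF_FOOT = 4   # 1,000 boxes / 250 lineal feet
BOXES_PER_CABINET = 20     # 1,000 boxes / 50 file cabinets


def estimate_paper_volume(gigabytes):
    """Convert a data volume in GB into rough paper equivalents."""
    pages = gigabytes * MB_PER_GB * PAGES_PER_MB
    boxes = pages / PAGES_PER_BOX
    return {
        "pages": pages,
        "boxes": boxes,
        "shelf_feet": boxes / BOXES_PER_SHELF_FOOT,
        "file_cabinets": boxes / BOXES_PER_CABINET,
    }


# The slide's 30 GB example.
estimate = estimate_paper_volume(30)
```

Because the ratios are back-calculated from the 30 GB example, other volumes will scale linearly from it; real page counts vary widely by file type.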

Electronic Data Volumes
  • QUIZ: A company with 10,000 employees generates 2.5 million e-mail messages per:
  • ___ Year? ____ Month? ___ Week?
We’ve had PCs since the early ’80s
  • So why didn’t e-Discovery show up until the mid-1990s?
The answer, of course, is…
  • Connectivity -- the Internet!
  • Until the mid-1990s, computers were just tools to create paper documents.
  • Then very quickly, business switched to written communication without paper.
    • e-mail replaced paper (and fax).
    • By 2000, paper “a superfluous by-product.”
    • e-mail even replaced the telephone.

“Many informal messages that were previously relayed by telephone or at the water cooler are now sent via email.” Byers v. Illinois State Police (N.D. Ill. 2002)

The explosion of electronic data

  • Over 95% of corporate documents are now electronic
  • Email has become indispensable
  • All electronic documents are discoverable
  • No more “I won’t ask if you won’t ask”. They’re asking.

[Chart: U.S. Corporate E-mail Volume Growth, in trillions; labeled value: 3.400 trillion. Source: Wall Street Journal, January 10, 2000; IDC]

The path not taken…
  • The committee might easily have decided that broad scope was no longer tenable.
  • Instead, they mostly preserved modern US-style broad discovery
  • and recognized that technology, the source of the problem, is also the source of the solution …
FRCP 26(f): “Meet and Confer”

Rule 26. General Provisions Governing Discovery; Duty of Disclosure

  • (f) to discuss any issues relating to preserving discoverable information…and to develop a proposed discovery plan concerning…
    • (3) discovery of electronically stored information including the form/s in which it should be produced…
    • (4) relating to privilege or protection as trial-preparation including asserting such claims after [inadvertent] production
The Stages of Discovery when it was Paper

  • 1. Collect: Physical collection (or delivery) of documents
  • 2. Organize: Photocopy or scan, Bates number, track documents in boxes or, by the ’90s, code into a database
  • 3. Review: Evaluate for production; decide relevance; decide privilege
  • 4. Produce: Ultimate physical delivery of documents; receiving from the other side

Except with Electronic Data, there’s also an earlier step: Preservation

  • 0. Preserve: Ensure the electronic data you need is kept intact
  • 1. Collect: How to find/copy/compile responsive ESI?
  • 2. Organize: How to process ESI (and its much greater volume) so you can review it and utilize it?
  • 3. Review: How to review ESI (and its much greater volume)?

The Stages of Discovery: the challenges when the information is Electronic

  • 1. Collect: How to find/copy/compile responsive EDD?
  • 2. Organize: How to process EDD (and its much greater volume) so you can review it and utilize it?
  • 3. Review: How to review EDD (and its much greater volume)?
  • 4. Produce: What is the best method for producing EDD? (And how would you like to receive it?)

The Stages of Discovery: Moving on from “Step Zero”, Preservation, to “Step 1”, Collection

  • 1. Collect: How to find/copy/compile responsive EDD?
  • 2. Organize: How to process EDD (and its much greater volume) so you can review it and utilize it?
  • 3. Review: How to review EDD (and its much greater volume)?
  • 4. Produce: What is the best method for producing EDD?


After you’ve collected the electronic data…

  • “…remember, that’s all you’ve got at that point. A whole lot of messy electronic data.”

William Cwiklo, Panelist on Electronic Data Discovery, Glasser LegalWorks, Fairmont Hotel, San Francisco, February 1999.

Discovery Stage 2: Organize that Electronic Data – meaning Process it somehow to make it usable

  • 1. Collect: How to find/copy/compile responsive EDD?
  • 2. Organize: How to process E-Data (and its much greater volume) so you can review it and use it?
  • 3. Review: How to review EDD (and its much greater volume)?
  • 4. Produce: What is the best method for producing EDD?


The Options for Processing Electronic Data (1-2)

  • 1. Print Everything: Print out the entire collection (from native app) and review paper for relevancy.
  • 2. Print->Scan->Code: The “1997” model

“In the shift to a new medium, the content reflects the previous medium.” -- Marshall McLuhan

Example: the first ten years of television were visual radio. (Acknowledgment to Michelle Ostrom of Attenex.) http://www.mcluhan.utoronto.ca/mcluhanprojekt/allen2.htm

1997 processing: Print-Scan-Code; Electronic to Paper to Electronic

[Diagram labels: Paralegal/Word Processing; Print out all files; Paper; Scanner; Results; Responsive review; (OCR); Coder; Litigation Database; Production]

The Options for Processing Electronic Data (3)

Why “process” electronic data at all?

  • 1. Print Everything: Print out the entire collection (from native app) and review paper for relevancy.
  • 2. Print->Scan->Code: The “1997” model
  • 3. “Do Nothing”: Review each custodian’s files in their Native format, and using the Native application software itself.
  • So what’s wrong with “doing nothing”?
The No-Process “Do nothing” approach: Using Outlook to review Outlook
  • No tagging, No annotating
  • No Redacting
  • Merely moving the data to another machine changes its appearance.
Using Outlook to review Outlook: “Advanced Find”
  • Slow
  • Limited search flexibility
  • Responses are simply a listing of e-mails – can’t format reports
  • Will NOT search attachments
The Options for Processing Electronic Data (4)
  • 1. Print Everything: Print out the entire collection (from native app) and review paper for relevancy.
  • 2. Print->Scan->Code: The “1997” model
  • 3. “Do Nothing”: Review each custodian’s files in their Native format, and using the Native application software itself.
  • 4. Convert (‘process’) electronic data to another electronic form better suited to reviewing: Then review entire collection either with in-house litigation support software or on-line through an ASP Repository
Processing Electronic Data – Conversion to TIFF in the late 1990s
  • Conversion of e-mails and e-docs to:
    • a TIFF image, linked to
    • indexed bibliographic information;
    • with full text;
    • and maintains parent/attachment relation.
    • A faster, cheaper way to convert e-data to the model we had gotten used to with paper – a database record linked to a scanned image.
By 1999, processing that Electronic Data meant Converting it to TIFF

  • 1. Collect: How to find/copy/compile responsive EDD?
  • 2. Process: How to process E-Data so you can review it and use it? The answer for a while was Convert to TIFF
  • 3. Review: How to review EDD (and its much greater volume)?
  • 4. Produce: What is the best method for producing EDD?

But the volume kept growing!

[Chart: U.S. Corporate E-mail Volume Growth, in trillions; labeled value: 3.400 trillion. Source: Wall Street Journal, January 10, 2000; IDC]

Sedona Conference Search and Information Retrieval, Principle 1:

In litigation… where the volume of discoverable electronically stored information is large, it may not be feasible to perform human review of every document for responsiveness or privilege, and automated search and information retrieval methods and tools may be necessary and valuable.

This isn’t just a brainstorm of words and phrases.

Courts now expect automated processes to identify responsive data
  • “A responding party may satisfy its good faith obligation to preserve and produce potentially responsive electronic data and documents by using electronic tools and processes, such as data sampling, searching, or the use of selection criteria, to identify data most likely to contain responsive information.” (emphasis added)
    • Zakre v. Norddeutsche Landesbank Girozentrale, 2004 WL 764895 (S.D.N.Y. Apr. 9, 2004) adopting Sedona Principle 11 verbatim.
Automated tools in e-discovery
  • De-duplication
  • Keywords and Boolean
  • Statistical Clustering
  • Natural Language and fuzzy searching
  • Concept search tools
  • Taxonomies and Ontologies
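
As an illustration of the first item, de-duplication, here is a minimal sketch that treats two files as duplicates only when their bytes are identical. That is an assumption made for the example: the slides do not say how any particular tool detects duplicates, and production tools may compare selected e-mail fields instead. The function names and document IDs are hypothetical.

```python
import hashlib


def content_hash(data):
    """Return a SHA-256 fingerprint of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(items):
    """Split a collection into unique files and a duplicate -> original map.

    `items` maps a (hypothetical) document ID to that document's raw bytes.
    The first document seen with a given hash is kept; later ones are
    recorded as duplicates of it rather than silently dropped.
    """
    first_seen = {}   # hash -> ID of the first document with that content
    unique = {}
    duplicates = {}
    for doc_id, data in items.items():
        digest = content_hash(data)
        if digest in first_seen:
            duplicates[doc_id] = first_seen[digest]
        else:
            first_seen[digest] = doc_id
            unique[doc_id] = data
    return unique, duplicates


collection = {
    "A-001": b"Quarterly report",
    "A-002": b"Lunch?",
    "B-001": b"Quarterly report",   # same bytes as A-001, different custodian
}
unique, duplicates = deduplicate(collection)
```

Keeping the duplicate-to-original map, instead of discarding duplicates outright, preserves the record of which custodians held a copy.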
“Search Engine” Software
  • Attenex
  • Autonomy
  • Cataphora
  • Dolphin Search
  • Engenium
  • Guidance
  • Stratify
  • Syngence
Approaches to Data Organization
  • Context
    • Relationship Analysis: documents with causal or sequential relationship
    • Social Network Analysis: relationships among relevant people
  • Concept
    • Ontology: generalized words or phrases
    • Clustering: similarity of salient features
  • Content
    • Keyword: specific exact words

A Simple Ontology
  • ROYALTY CONCEPT
    • Royalty
    • Commission
    • Honorarium
    • Usage Fee
    • Slice of the Pie
A More Realistic Ontology
  • ROYALTY CONCEPT
    • royalty
    • royalties
    • rty
    • commission
    • commissions
    • comm.
    • honorarium
    • honorariums
    • honoraria
    • usage fee
    • usage charge
    • usg fee
    • use fee
    • fee for use
    • fee for usage
    • incent*
    • insent*
    • charge for use
    • charged for use
    • charging for use
    • charges for use
    • licence fee
    • license fee
    • lisense fee
    • “take cut”~2
    • “takes cut”~2
    • “took cut”~2
    • “slice pie”~5
    • “piece pie”~5
    • “piece action”~5
    • “slice action”~5
    • -king
    • -queen
    • -prince
    • -princess
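
To make the notation in this ontology concrete, the sketch below applies a trimmed-down version of the ROYALTY CONCEPT to a piece of text. The semantics are assumptions, not something the slides define: a trailing `*` is read as a prefix wildcard, `"take cut"~2` as "these two words with at most two words between them", and a leading `-` as "reject any document containing this word". The dictionary layout and function names are hypothetical.

```python
import re

# A trimmed-down version of the slide's ROYALTY CONCEPT. The layout and key
# names are hypothetical, chosen only for this sketch.
ROYALTY_CONCEPT = {
    "phrases": ["royalty", "royalties", "usage fee", "license fee", "lisense fee"],
    "prefixes": ["incent", "insent"],                        # incent*, insent*
    "proximity": [("take", "cut", 2), ("slice", "pie", 5)],  # "take cut"~2
    "exclude": ["king", "queen", "prince", "princess"],      # -king, -queen, ...
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_hit(tokens, words):
    """True if `words` occur consecutively, in order, in `tokens`."""
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


def proximity_hit(tokens, first, second, max_gap):
    """True if the two words occur with at most `max_gap` words between them."""
    first_pos = [i for i, t in enumerate(tokens) if t == first]
    second_pos = [i for i, t in enumerate(tokens) if t == second]
    return any(0 < abs(i - j) <= max_gap + 1 for i in first_pos for j in second_pos)


def matches_concept(text, concept):
    """Apply exclusions first, then accept on any phrase, prefix or proximity hit."""
    tokens = tokenize(text)
    if any(word in tokens for word in concept["exclude"]):
        return False
    return (
        any(phrase_hit(tokens, tokenize(p)) for p in concept["phrases"])
        or any(t.startswith(p) for p in concept["prefixes"] for t in tokens)
        or any(proximity_hit(tokens, a, b, gap) for a, b, gap in concept["proximity"])
    )
```

Under these assumptions, `matches_concept("He wants a slice of the pie.", ROYALTY_CONCEPT)` is true even though the word "royalty" never appears, while a sentence mentioning a king is rejected. The deliberate misspellings ("lisense", "insent*") are kept because they catch typos in real e-mail.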
Reviewing the Right Data
  • Intake Data: 100%
  • Duplicates: 25%
  • Non-Responsive (NR) Junk (Spam/Jokes/etc.): 20%
  • NR Business: 20%
  • NR Personal: 20%
  • Privileged: 3%
  • Relevant & Responsive: 12%

Estimates: These figures vary based upon the data set received

Getting to Responsive Data: Keywords versus Ontologies
  • Reviewable: 1.575
  • “Responsive” to Keywords: 0.842
  • Final Ontology Pass Responsive: 0.109

All numbers in millions of items

Yet for all that breadth, keywords still miss vital documents!

8,553 responsive documents missed by keyword search

(Almost 8% of responsive documents missed by keyword search)
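
The proportions behind these figures can be worked out directly. One assumption is needed: the slides do not say what the "almost 8%" is measured against, so the sketch below takes the 0.109 million items from the final ontology pass as the responsive set. Variable and function names are hypothetical.

```python
# Figures from the slides, converted from millions of items to items.
reviewable = 1_575_000
keyword_hits = 842_000
ontology_responsive = 109_000
missed_by_keywords = 8_553


def share(part, whole):
    """Return `part` as a percentage of `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# How much of the reviewable set each approach sends to reviewers.
keyword_share = share(keyword_hits, reviewable)            # about 53.5%
ontology_share = share(ontology_responsive, reviewable)    # about 6.9%

# Assumed denominator: the ontology pass's responsive set.
missed_share = share(missed_by_keywords, ontology_responsive)  # about 7.8%
```

On that reading, keywords flag more than half of the reviewable set yet still miss roughly 7.8% of the responsive documents, which is consistent with the slide's "almost 8%".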

Cost and Time Savings
  • Cost to review “keyword” docs: $2,526,000
  • Cost to process, create ontologies and review docs found by them: $1,621,076
  • Net cost savings: $904,924
  • Keyword review time: Over 11 weeks
  • Ontology time: 6 weeks or less including both review and processing time
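
The savings figure is a simple subtraction, shown here as a check. The per-document rate is not stated on the slide; it is derived on the assumption that the keyword review covered the 0.842 million keyword hits reported earlier. Variable names are hypothetical.

```python
# Dollar figures from the slide.
keyword_review_cost = 2_526_000
ontology_total_cost = 1_621_076   # processing, ontology creation and review

net_savings = keyword_review_cost - ontology_total_cost
savings_pct = round(100 * net_savings / keyword_review_cost, 1)

# Derived, not stated on the slide: assumes the keyword review covered the
# 842,000 keyword-responsive items reported on the earlier slide.
keyword_docs_reviewed = 842_000
cost_per_keyword_doc = keyword_review_cost / keyword_docs_reviewed
```

The subtraction reproduces the slide's $904,924, a saving of about 35.8%, and the derived rate works out to $3.00 per document reviewed.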
The end-product of Processing now
  • Less data
  • Standardized so each e-mail, each attachment, each free-standing electronic file will have:
    • A “database record” linked to
    • The data itself in its native format
    • and/or with other renderings, a TIFF or PDF image.
  • So no longer is it “a whole lot of messy electronic data”
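
One way to picture that standardized end-product is as a record type. The sketch below is illustrative only: the field names, the ID scheme and the helper function are assumptions, not the schema of any particular review platform.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class ProcessedItem:
    """One database record for an e-mail, attachment or free-standing file."""
    item_id: str                           # hypothetical control number
    kind: str                              # "email", "attachment" or "file"
    native_path: Path                      # the data itself, in native format
    metadata: dict = field(default_factory=dict)   # e.g. author, date, subject
    rendering_path: Optional[Path] = None  # optional TIFF or PDF image
    parent_id: Optional[str] = None        # links an attachment to its e-mail


def attachments_of(items, parent_id):
    """Return the items whose parent is the given e-mail."""
    return [item for item in items if item.parent_id == parent_id]


email = ProcessedItem("E-0001", "email", Path("native/E-0001.msg"),
                      {"from": "a@example.com"})
sheet = ProcessedItem("E-0001.1", "attachment", Path("native/E-0001.1.xls"),
                      parent_id="E-0001")
```

Keeping `parent_id` on each record preserves the parent/attachment relationship that the TIFF-era workflow also maintained.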
The Evolution of Electronic Discovery Processing

  • 1995: Print and Review
  • 1997: Print, Scan, Code, Review
  • Circa 1999: TIFF and Review
  • 2001: Simple Filtering
  • 2002: Keyword Searching
  • 2004-: Analytical, Defensible, Reliable Reduction, then Review

Discovery Stage 3: Review

  • 1. Collection: How to find/copy/compile responsive EDD?
  • 2. Organization: How to process EDD so you can review and utilize?
  • 3. Review: How to review EDD (and its substantially greater volume)?
  • 4. Production: What is the best method for producing EDD?

“But I only trust humans looking at every document -- it’s tried and true”

Full review is rarely as accurate as automated searching.

Humans make errors, get distracted, bored and tired.

Typical human error rate is 25%

And the expense of human review of every document, in dollars and time, is prohibitive.
No manual review of millions of documents is cost-effective or accurate

  • After culling by whatever means, you’ve still got quite a lot.
  • Use computing power to enhance review
    • Grouping data, multiple document decisions at once
    • Workflow / QA can accelerate and improve quality

Why Context Is Important

  • In a hardcopy document, a prior sentence or paragraph provides the context
  • In an e-mail, SMS/Text message, or IM, the previous or next message may provide context
  • Today, a whole case could turn on…
    • Let’s Do it!
    • OK. Go Ahead!
    • Sure
    • G2G, SLAP, WIIFM
Reviewing without Context

Is this document:

  • Privileged
  • Non-Responsive
  • Relevant?
  • Incriminating?

Can’t really tell?

What “matter”?

“It”?

What’s “touchy”?

Context Across Documents

Our bond offering has a cash shortfall. What shall we do?

Let’s issue more bonds to cover the shortfall.

Great idea. Let’s go ahead with it!

That’s illegal. Don’t even think about it.
Context Provides Meaning
  • Let’s focus on the last two documents.
  • Note that the word “bond” is not used.
  • Nevertheless, these documents contain important evidence.

Great idea. Let’s go ahead with it!

That’s illegal. Don’t even think about it.

The Solution

Our bond offering has a cash shortfall. What shall we do?

Let’s issue more bonds to cover the shortfall.

Great idea. Let’s go ahead with it!

That’s illegal. Don’t even think about it.

  • Review documents as a causally related group, not in isolation.
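
A simple way to approximate a causally related group is to follow reply links between messages. The sketch below does only that, using hypothetical `message_id` and `in_reply_to` fields modeled on the standard e-mail headers of the same names. It is not a description of how any vendor's context analysis works, and reply chains capture only one kind of causal relationship.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    message_id: str
    in_reply_to: Optional[str]   # ID of the message this one answers, if any
    text: str


def thread_root(msg, by_id):
    """Walk reply links upward until reaching a message with no known parent."""
    seen = set()
    while msg.in_reply_to in by_id and msg.message_id not in seen:
        seen.add(msg.message_id)   # guard against malformed reply loops
        msg = by_id[msg.in_reply_to]
    return msg.message_id


def group_by_thread(messages):
    """Group messages by the root of their reply chain, keeping input order."""
    by_id = {m.message_id: m for m in messages}
    threads = {}
    for m in messages:
        threads.setdefault(thread_root(m, by_id), []).append(m)
    return threads


messages = [
    Message("m1", None, "Our bond offering has a cash shortfall. What shall we do?"),
    Message("m2", "m1", "Let's issue more bonds to cover the shortfall."),
    Message("m3", "m2", "Great idea. Let's go ahead with it!"),
    Message("m4", "m2", "That's illegal. Don't even think about it."),
    Message("m5", None, "Lunch on Friday?"),
]
threads = group_by_thread(messages)
```

Grouped this way, the two replies that never mention "bond" reach the reviewer alongside the messages that give them meaning, instead of being coded in isolation.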
Discovery Stage 4: Production

  • 1. Collection: How to find/copy/compile responsive EDD?
  • 2. Organization: How to process EDD so you can review and utilize?
  • 3. Review: How to review EDD (and its substantially greater volume)?
  • 4. Production: What is the best method for producing EDD?

FRCP 34 [amended, continued]
  • Rule 34. Production of Documents, Electronically Stored Information, and Things and Entry Upon Land for Inspection and Other Purposes.

(b) PROCEDURE. … the request may specify the form in which the electronically stored information is to be produced….

…[the responding party may object] to the requested form, stating the reasons, and the form it intends to use [instead].

Let the Games Begin!

Possible Production Formats
  • Paper (if the other side asks for it this way, be happy to oblige. It is the least useful format in which to receive a production.)
  • Paper-like (TIFF or PDF images)
    • TIFF images without any searchable data at all are increasingly unacceptable.
  • Native Files
  • Hosted “production” areas of the producing party’s web repository.
Don’t forget, you’re in it to win it
  • After production, you still have to work with all your data and everything the other side has produced to you;
  • Prepare for depositions, brief witnesses, prepare for trial, investigate and analyze
  • Any database allows searching, sorting, and basic reporting
  • – yawn, that’s so 1987.

Cliff Shnier, JD

Director, Business Development

Cataphora Inc

Scottsdale, AZ

480-661-6183

cliff@cataphora.com