

TREC-2006 Legal Track Planning Session

Jason Baron

Dave Lewis

Doug Oard

TREC 2005


Welcome to TREC-MOOT

Representing the plaintiff (“Benzo” Pyrene):

  • Jason Baron, J.D.

Representing the defendant (Philip Norris, Inc.):

  • David D. Lewis, Ph.D.

  • Complaint

  • Production Request

  • Query negotiation


Complaint

INTRODUCTION

1. Benjamin A. Pyrene, on behalf of a class of individuals injured in childhood and adulthood by the effects of second hand smoke, brings this action to enjoin defendant tobacco companies from continuing to make false and misleading statements regarding the health consequences of second hand smoke, in violation of the Commonwealth of TREC’s Fraud Statute, as well as provisions of the tobacco Master Settlement Agreement (“MSA”), and the Consent Decree and Final Judgment (“Consent Decree”) entered into by the Commonwealth of TREC and approved by the Court on October 12, 2005. Plaintiffs request a finding of contempt for violation of the Consent Decree and imposition of monetary sanctions, civil penalties, and costs, including the costs of investigating and pursuing this action.

PARTIES

2. Plaintiff Benjamin (“Benzo”) A. Pyrene brings this action on behalf of a nationwide class of individuals injured in childhood and adulthood by defendants’ actions. Mr. Pyrene resides at 12 Combustible Way, Commonwealth of TREC.

3. Defendant Philip Norris Inc. is a Commonwealth of TREC corporation with its principal place of business in the city of Kendall, County of Tau, Commonwealth of TREC. Other corporations are as identified in the attachments to this Complaint.

JURISDICTION

4. This Court has jurisdiction pursuant to 1 Comm. Trec Sec. 1956, the Consent Decree, and under the MSA, section VII(a).

BACKGROUND

5. According to information and belief, second-hand smoke ranks third as a major preventable cause of death behind only active smoking and alcohol. Second-hand smoke is the smoke that individuals breathe when they are located in the same air space as smokers. Second-hand smoke is a mixture of exhaled …


TOBACCO COMPANIES’ ACTIONS

12. Philip Norris has made numerous representations since the filing of the MSA and Consent Decree regarding the lack of danger from secondhand smoke. A complete listing of these misrepresentations is provided in a table at imaginary Attachment 1, showing the time and place of each misrepresentation as well as a summary of its content.

COUNT I (Consumer Fraud – Deception)

13. Defendants have engaged in a pattern or practice of deceptive acts or practices in violation of the above-referenced statutes, by making false or misleading representations about the reduced health risks associated with second hand smoking.

COUNT II (MSA)

14. Defendants’ actions in misrepresenting the effects of second hand smoke violate the MSA at section III(r), because they are material misrepresentations of fact regarding the health consequences of using a tobacco product.

COUNT III (Consent Decree)

15. Defendants’ actions in misrepresenting the effects of second hand smoke violate the Consent Decree, because they are material misrepresentations of fact regarding the health consequences of using a tobacco product.

RELIEF REQUESTED

Wherefore, Plaintiffs request that this Court enter the following relief:

Declare Philip Norris et al. violated the MSA and Consent Decree by making statements that are false regarding the effects of second hand smoke, which in turn have created a substantial risk of harm to consumers.

Permanently enjoin defendants, their officers, agents, servants, employees and attorneys, and those persons in active concert or participation with them who receive actual notice of the injunction, from representing in any manner, expressly or implicitly, directly or indirectly, in connection with the manufacturing, advertising, packaging, labeling, promotion, offering for sale, sale, or distribution of cigarettes or any other tobacco product for which it does not possess competent and reliable scientific information sufficient to support such representation, that exposure to second hand smoke is perfectly safe.

Enter an order imposing monetary sanctions and a Civil Contempt Order for violations of the Consent Decree and MSA.

Impose a civil penalty of $1,000 for each violation of State law.

Order defendants to pay costs and expenses, including attorneys’ fees, in connection with the investigation and litigation of this matter. …


TREC-MOOT

Representing the plaintiff (“Benzo” Pyrene):

  • Jason Baron, J.D.

Representing the defendant (Philip Norris, Inc.):

  • David D. Lewis, Ph.D.

  • Complaint

  • Production Request

  • Query negotiation


Production Request

BENZO A. PYRENE v. PHILIP NORRIS – REQUESTS TO PRODUCE PROPOUNDED BY PLAINTIFFS PURSUANT TO FED. R. CIV. P. 34

  1. All documents referencing scientific research on the effects of second hand smoking.

  2. All documents that expressly link second hand smoke to being a medical health hazard.

  3. All documents that discuss one of the following topics plus expressly reference second hand smoking:

    • sidestream smoke

    • platelet activation

    • abnormalities of vasodilation

    • injury to the arterial lining

    • atherosclerosis

    • benzo(a)pyrene

    • butadiene

  4. All documents showing that senior level management at Philip Norris were aware of the dangers of second hand smoking.

  5. All documents referencing asthma in children.

  6. All documents referencing smoke free ordinances governing public places. …


Welcome to TREC-MOOT

Representing the plaintiff (Mr. “Benzo” Pyrene):

  • Jason Baron, J.D.

Representing the defendant (Philip Norris, Inc.):

  • David D. Lewis, Ph.D.

  • Complaint

  • Production Request

  • Query negotiation


Shifting the Rules of the Game

  • Classic IR

    • Goal: satisfy a visceral information need

    • Understanding of need evolves during search

    • Personal view of relevance

  • E-Discovery

    • Goal: identify a set of responsive documents

    • Negotiated information need

    • Agreed / defensible / explainable process


Key Stakeholders

  • E-Discovery participants (Sedona Conference)

    • Judges

    • Law firms

    • Regulatory agencies

    • Technology providers

  • IR research teams (TREC)

    • Negotiated information needs

    • Different genre (document images, metadata, …)


Primary-Source Search Use Cases in Legal Applications

  • Two-party [Legal Track focus]

    • “Discovery”

      • Negotiate relevance definition, search process

      • Contract lawyers identify relevant documents

      • Partners review relevant docs for “privilege”

    • Regulatory / Oversight investigation

    • Freedom of Information Act (FOIA)

  • One-party [not our focus]

    • Risk assessment


Process Requirements (in order of decreasing importance)

  • Two-party

    • Negotiated (not personal) information needs

  • Recall-oriented

    • “Smoking gun detection”

  • Explainable

    • Quantifiable comparison to present best practice

  • Affordable

    • Minimize amount of human review


Current E-Discovery Process

[Flow diagram. Box labels: Acquisition, Collection, Indexing, Index, Query Formulation, Query, Boolean Retrieval, Result Set, Review, Responsive Documents, Delivery]
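To make the Boolean Retrieval step named in this diagram concrete, here is a minimal Python sketch. The three toy documents, the whitespace tokenizer, and the helper names `build_index` and `boolean_and_not` are assumptions made for illustration only; they are not the tooling used in practice or by the track.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def boolean_and_not(index, required, excluded=()):
    """Return IDs of documents with every required term and no excluded term."""
    if not required:
        return set()
    # Intersect the posting sets of all required terms (AND).
    result = set.intersection(*(index.get(term, set()) for term in required))
    # Remove any document that contains an excluded term (NOT).
    for term in excluded:
        result -= index.get(term, set())
    return result


# Hypothetical three-document collection.
DOCS = {
    "d1": "second hand smoke research memo",
    "d2": "marketing plan for new brand",
    "d3": "smoke free ordinance research draft",
}
INDEX = build_index(DOCS)
RESULT_SET = boolean_and_not(INDEX, ["smoke", "research"], excluded=["draft"])
print(sorted(RESULT_SET))
```

The result is an unordered set: every document in it goes to review, and nothing outside it does, which is the property the ranked alternative relaxes.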


Possible E-Discovery Process

[Flow diagram. Box labels: Source Selection, Acquisition, Collection, Indexing, Index, IR System, Query Formulation, Query, Ranked Retrieval, Ranked List, Selection, Result Set, Incremental Review, Responsive Documents, Delivery]
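The Ranked Retrieval and Incremental Review boxes in this diagram could be sketched roughly as follows. The term-count scoring, the batch size, and the rule of stopping after a batch with no responsive documents are all assumptions chosen to keep the example short; the slides do not specify a ranking function or a stopping rule.

```python
from collections import Counter


def rank(docs, query_terms):
    """Rank document IDs by summed query-term counts, highest first (ties by ID)."""
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        scores[doc_id] = sum(counts[term] for term in query_terms)
    return sorted(docs, key=lambda doc_id: (-scores[doc_id], doc_id))


def incremental_review(ranked, is_responsive, batch_size=2):
    """Review the ranked list batch by batch.

    Stops after the first batch that yields no responsive documents
    (an assumed stopping rule). Returns the responsive documents found
    and the number of documents a human had to review.
    """
    responsive, reviewed = [], 0
    for start in range(0, len(ranked), batch_size):
        batch = ranked[start:start + batch_size]
        reviewed += len(batch)
        hits = [doc_id for doc_id in batch if is_responsive(doc_id)]
        responsive.extend(hits)
        if not hits:
            break
    return responsive, reviewed


# Hypothetical collection and reviewer judgments.
DOCS = {
    "d1": "smoke smoke research",
    "d2": "smoke memo",
    "d3": "brand plan",
    "d4": "budget plan",
    "d5": "annual report",
    "d6": "travel form",
}
RANKED = rank(DOCS, ["smoke", "research"])
FOUND, REVIEWED = incremental_review(RANKED, lambda d: d in {"d1", "d2"})
print(FOUND, REVIEWED)
```

In this toy run the reviewer looks at four of the six documents, which illustrates the "minimize amount of human review" requirement from the earlier slide.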


Collection Options

  • Tobacco (IIT/UCSF) [Discovery]

    • 3-7 million scanned documents, diverse genre

      • Good OCR available for >1 million documents

    • Expert judges and assessment system exist

  • Enron (FERC) [Regulatory Oversight]

    • ~100,000 emails, attachments, phone transcripts

    • Sample topics exist, judgment will be hard

  • State Department (National Archives) [FOIA]

    • ~500,000 “cables” (messages)


IIT/UCSF Tobacco Collection

  • 7 million scanned documents

    • Distributed in “standard” TREC XML format

    • Probably on a few DVDs

  • Some form of OCR for half the collection

    • Possibly from two systems

  • Metadata fields for the full collection

    • People, date, source, …

    • 7 company-specific DTDs

  • Goal is to search the full collection

    • OCR-subset results will also be reported


IIT/UCSF Metadata Example

Field labels: DOCID, Bates number, Date, Author, Mentioned, Recipient, Title

<A ID="BVW63A00">
  <br t="p">504110499-0499</br>
  <YR>19000000</YR>
  <L>CARLSON TN; LRD</L>
  <rn t="m">MINNESOTA 1RFP128;
    MINNESOTA COURT ORDER;
    US COMPREHENSIVE REQUEST 343;
    US COMPREHENSIVE REQUEST 175;
    US COMPREHENSIVE REQUEST 179;
    MISSOURI COURT ORDER 19980814</rn>
  <r>COLBY FG</r>
  <K>CORRESPONDENCE REFLECTING RESULTS OF LITERATURE SEARCH PREPARED BY LRD EMPLOYEE ENGAGED TO ASSIST ATTORNEYS AND TRANSMITTED TO RJR SCIENTIST WORKING ON BEHALF OF THE LEGAL DEPARTMENT.</K>
  <st>R PRIV:WP;JD</st>
  <dm>20031215</dm>
</A>
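A record like this one can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is only an illustration: the tag meanings come from the DTD comments reproduced at the end of this transcript, while the readable key names, the shortened sample record, and the assumption that semicolons separate multiple values are mine.

```python
import xml.etree.ElementTree as ET

# Tag meanings follow the DTD comments in this transcript; the readable
# key names on the right are hypothetical.
FIELD_NAMES = {
    "br": "bates_range",
    "YR": "document_date",
    "L": "author",
    "r": "recipient",
    "rn": "request_number",
    "K": "title",
}
# Assumption: in these fields a semicolon separates multiple values.
MULTI_VALUED = {"author", "recipient", "request_number"}


def parse_record(xml_text):
    """Turn one <A> metadata record into a plain dictionary."""
    root = ET.fromstring(xml_text)
    record = {"docid": root.get("ID")}
    for child in root:
        name = FIELD_NAMES.get(child.tag, child.tag)
        # Collapse line breaks and repeated spaces inside the element text.
        text = " ".join((child.text or "").split())
        if name in MULTI_VALUED:
            record[name] = [v.strip() for v in text.split(";") if v.strip()]
        else:
            record[name] = text
    return record


# Shortened copy of the slide's example record.
SAMPLE = """<A ID="BVW63A00">
<br t="p">504110499-0499</br>
<YR>19000000</YR>
<L>CARLSON TN; LRD</L>
<rn t="m">MINNESOTA 1RFP128;
MINNESOTA COURT ORDER</rn>
<r>COLBY FG</r>
</A>"""
RECORD = parse_record(SAMPLE)
print(RECORD["docid"], RECORD["author"])
```

Tags without an entry in `FIELD_NAMES` (for example `st` and `dm` in the full record) are kept under their original tag name rather than guessed at.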


Academic Researcher’s Topic

Title: Firesafe Cigarettes

Relevance Criteria: Relevant documents provide information on 1) firesafe cigarettes, or 2) tobacco industry responses to legislation and interest around firesafe cigarettes.

Relevance Judges and/or Sources of Relevant Documents: [4 names]

Keywords: firesafe, firesafe cigarettes, reduced flammability, self-extinguishing, fire safety education, accidental fires, fire prevention, furniture flammability, reduced ignition

Key Attributes: Most relevant documents will have been created between 1980 and the present.


Some Issues

  • Evaluation measures

    • N-recall (recall at cutoff N, where N is the number of documents the Boolean query found)

  • Topic Generation

    • Modeling a representative process

    • Question typology

  • Minimizing the cost of entry

    • Format as superset of standard TREC style

  • Outreach to E-Discovery technology groups
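One way to read the N-recall measure listed above is as recall computed over the top N documents of a ranked run, with N set to the size of the Boolean result set, so that both approaches are compared at the same review effort. The sketch below follows that reading; the document IDs and relevance judgments are invented for illustration.

```python
def recall_at_n(ranked, relevant, n):
    """Fraction of the relevant documents that appear in the top n of a ranked list."""
    if not relevant:
        raise ValueError("need at least one relevant document")
    return len(set(ranked[:n]) & set(relevant)) / len(relevant)


# Hypothetical data: a Boolean result set, a ranked run, and relevance judgments.
BOOLEAN_RESULT = {"d1", "d4", "d7"}
RANKED_RUN = ["d1", "d2", "d3", "d4", "d5"]
RELEVANT = {"d1", "d2", "d3", "d9"}

# The Boolean result set size fixes the cutoff for the ranked run.
N = len(BOOLEAN_RESULT)
BOOLEAN_RECALL = len(BOOLEAN_RESULT & RELEVANT) / len(RELEVANT)
RANKED_N_RECALL = recall_at_n(RANKED_RUN, RELEVANT, N)
print(BOOLEAN_RECALL, RANKED_N_RECALL)
```

Fixing the cutoff this way gives the "quantifiable comparison to present best practice" that the process requirements ask for.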


Legal Track Topic Generation

  • Develop 30 topics for 2006 (losing a few)

  • Start with a complaint for simulated lawsuit

    • Requestor defines specific information needs

  • 2 lawyers negotiate Boolean query

    • Responding party can do preliminary searches

    • Result set size defines ranked list depth

  • Same lawyers negotiate Ranked query

    • Optionally, providing more information


Possible Query Fields

  • Title / Description / Narrative

  • Negotiated Boolean

  • Boolean “limit-by” suggestion

  • “Rank by” cues

    • Metadata

    • Free text


Focus Conditions

  • Ranked retrieval from TD (“required”?)

  • Ranked retrieval from everything

  • Ranked retrieval from Boolean only
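The query fields and focus conditions on the last two slides could be represented along these lines. This is a sketch, not a proposed topic format: the class and function names are hypothetical, "TD" is read as title plus description, and the example title and negotiated Boolean string are invented, while the description, limit-by, and rank-by values are borrowed from other slides in this deck.

```python
from dataclasses import dataclass, field


@dataclass
class LegalTopic:
    """One topic carrying the candidate query fields from the slide."""
    title: str
    description: str = ""
    narrative: str = ""
    negotiated_boolean: str = ""
    limit_by: str = ""
    rank_by_metadata: list = field(default_factory=list)
    rank_by_free_text: list = field(default_factory=list)


def query_text(topic, condition):
    """Return the topic fields a system may use under a given focus condition."""
    if condition == "TD":  # assumed to mean title + description
        return [topic.title, topic.description]
    if condition == "boolean_only":
        return [topic.negotiated_boolean]
    if condition == "everything":
        return [
            topic.title,
            topic.description,
            topic.narrative,
            topic.negotiated_boolean,
            topic.limit_by,
            *topic.rank_by_metadata,
            *topic.rank_by_free_text,
        ]
    raise ValueError(f"unknown focus condition: {condition}")


TOPIC = LegalTopic(
    title="Second hand smoke research",  # hypothetical
    description="All documents referencing scientific research on the "
                "effects of second hand smoking",
    negotiated_boolean='"second hand" AND smoke AND research',  # hypothetical
    limit_by='Marlboro NOT "Upper Marlboro"',
    rank_by_metadata=["{policy staff}"],
    rank_by_free_text=["regulation", "manipulation"],
)
print(query_text(TOPIC, "TD"))
```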


Strawman Schedule

Jan 1 Commit to a collection

Mar 1 Guidelines ready

Apr 1 Collection release

Jul 1 Topic release

Aug 1 Runs submitted by sites

Aug 5 Pools ready for judgment

Sep 20 Judgments completed

Oct 1 Results release

Mid-Nov TREC-2006


Questions for Participants

  • Who wants to participate?

  • Is Aug 1 submission OK?

    • Would very early data release help?

  • Do we want tasks other than doc retrieval?

    • Social networks?

  • Should we create a robust subtrack?


Questions for NIST

  • Can NIST help with:

    • Distributing the collection?

    • Building specialized scoring scripts?

    • Accepting runs and creating judgment pools?

    • Scoring official runs?

  • Who will be our primary NIST contact?

  • Do our dates match your constraints?
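For readers unfamiliar with judgment pools, the usual idea is to merge the top-ranked documents from every submitted run and send only that merged set to the assessors. The slides do not say how pools would be built for this track, so the depth-based pooling below, the site names, and the run contents are assumptions for illustration.

```python
def build_pool(runs, depth):
    """Union of the top `depth` documents from every run, sorted for stable output."""
    pool = set()
    for ranked in runs.values():
        pool.update(ranked[:depth])
    return sorted(pool)


# Hypothetical runs submitted by two sites for one topic.
RUNS = {
    "siteA": ["d3", "d1", "d2"],
    "siteB": ["d1", "d4", "d5"],
}
POOL = build_pool(RUNS, depth=2)
print(POOL)
```

Documents that no run placed in its top `depth` are never judged, which is one reason the scoring scripts mentioned above may need to handle unjudged documents explicitly.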


Next Steps

  • Mailing list



Incremental Ranked Review

Limit to: Marlboro NOT “Upper Marlboro”

Rank by: tobacco, {policy staff}

Limit to: Marlboro NOT “Upper Marlboro”

Limit to: Marlboro NOT “Upper Marlboro”

Rank by: regulation, manipulation
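The "Limit to" and "Rank by" lines on this slide suggest a two-stage query: a Boolean filter that fixes which documents are in scope, followed by ranking cues that decide the review order. A minimal sketch under several assumptions: the NOT is applied at document level, ranking is by simple cue counts, the metadata cue `{policy staff}` is left out, and the four documents are invented.

```python
def limit_to(docs, include, exclude_phrase):
    """Keep documents that contain `include` and do not contain `exclude_phrase`."""
    kept = {}
    for doc_id, text in docs.items():
        lowered = text.lower()
        if include.lower() in lowered and exclude_phrase.lower() not in lowered:
            kept[doc_id] = text
    return kept


def rank_by(docs, cues):
    """Order document IDs by how often the cue words occur (ties by ID)."""
    def score(text):
        words = text.lower().split()
        return sum(words.count(cue.lower()) for cue in cues)
    return sorted(docs, key=lambda doc_id: (-score(docs[doc_id]), doc_id))


# Hypothetical documents.
DOCS = {
    "d1": "Marlboro regulation memo on regulation",
    "d2": "Meeting in Upper Marlboro about tobacco",
    "d3": "Marlboro tobacco policy",
    "d4": "Camel brand plan",
}
IN_SCOPE = limit_to(DOCS, "Marlboro", "Upper Marlboro")
BY_TOBACCO = rank_by(IN_SCOPE, ["tobacco"])
BY_REGULATION = rank_by(IN_SCOPE, ["regulation", "manipulation"])
print(BY_TOBACCO, BY_REGULATION)
```

The same limited set is reordered by each set of cues, so changing the "Rank by" line changes what reviewers see first without changing what is in scope.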


<!--DTD for rjr-->

<!--"Document". This element contains all of the meta-data about a given document. It has an "ID" attribute used to uniquely identify a given document. The values of this attribute are referred to as TIDs (Tobacco IDs). They are generally of the format "AAAddAdd", in which "A" stands for any letter and "d" stands for any digit. Sometimes, due to old errors, a TID of the form "AAAdAAdd" will occur. These values are allocated and assigned to documents in the Relational Database.-->
<!ELEMENT A (br, r?, L?, pb?, b?, c?, rn?, YR, dt?, co?, m?, br?, dp, p?, re?, ag?, rn, sc?, sh?, s?, te?, K?, DS, PV)>
<!ATTLIST A ID CDATA #REQUIRED>

<!--"Bates Range". The source data contains two bates range fields: "Document ID" (which is abbreviated here as "num"), and "Other Number" (abbreviated as "onm"). This element has a type attribute "t" used to indicate which field the element represents. A value of "p" (for "primary") means that it represents "Document ID". A value of "o" (for "other") means that it represents "Other Number".-->
<!ELEMENT br (#PCDATA)>
<!ATTLIST br t CDATA>

<!--"Recipient" here abbreviated as "add"-->
<!ELEMENT r (#PCDATA)>

<!--"Author" here abbreviated as "aut"-->
<!ELEMENT L (#PCDATA)>

<!--"Production Box" here abbreviated as "box"-->
<!ELEMENT pb (#PCDATA)>

<!--"Brand" here called "brd"-->
<!ELEMENT b (#PCDATA)>

<!--"Copied" here abbreviated as "cpy"-->
<!ELEMENT c (#PCDATA)>

<!--"Request Number". The source data contains two request number fields: "Request Number" (abbreviated as "req") and "Possible Minnesota Requests" (abbreviated as "crs"). The element has a type attribute "t" used to indicate which of these two fields it represents. A value of "p" indicates that it represents the first. A value of "m" indicates that it represents the second.-->
<!ELEMENT rn (#PCDATA)>
<!ATTLIST rn t CDATA>

<!--"Document Date" here abbreviated as "docdt"-->
<!ELEMENT YR (#PCDATA)>

<!--"Document Type" here abbreviated as "dtp"-->
<!ELEMENT dt (#PCDATA)>

<!--"Characteristics" here abbreviated as "mar" for "marginalia"-->
<!ELEMENT co (#PCDATA)>

<!--"Mentioned" here abbreviated as "men"-->
<!ELEMENT m (#PCDATA)>

<!--"Date Produced" here abbreviated as "dpt"-->
<!ELEMENT dp (#PCDATA)>

<!--"Page Count" here abbreviated as "pglen"-->
<!ELEMENT p (#PCDATA)>

<!--"Redacted Information" here abbreviated as "red"-->
<!ELEMENT re (#PCDATA)>

<!--"Attachment Group" here abbreviated as "ref" for "reference document"-->
<!ELEMENT ag (#PCDATA)>

<!--"Special Collections" here abbreviated as "scoll"-->
<!ELEMENT sc (#PCDATA)>

<!--"Date Shipped" here abbreviated as "ship"-->
<!ELEMENT sh (#PCDATA)>

<!--"Source" here abbreviated as "src"-->
<!ELEMENT s (#PCDATA)>

<!--"Trial Exhibit" here abbreviated as "texh"-->
<!ELEMENT te (#PCDATA)>

<!--"Title" here abbreviated as "ttl"-->
<!ELEMENT K (#PCDATA)>

<!--"Data Source". This is a single-letter code indicating that this document is from the American Tobacco set.-->
<!ELEMENT DS (#PCDATA)>

<!--"Provenance". This element contains a two-letter abbreviation for the data set, followed by a space, followed by the TID of the current record.-->
<!ELEMENT PV (#PCDATA)>
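The two TID layouts described in the first DTD comment ("AAAddAdd" and the occasional "AAAdAAdd") translate directly into regular expressions. The sketch below is only an illustration; the labels "standard" and "legacy" and the function name are hypothetical.

```python
import re

# "A" = any letter, "d" = any digit, as described in the DTD comment.
TID_STANDARD = re.compile(r"[A-Za-z]{3}[0-9]{2}[A-Za-z][0-9]{2}")  # AAAddAdd
TID_LEGACY = re.compile(r"[A-Za-z]{3}[0-9][A-Za-z]{2}[0-9]{2}")    # AAAdAAdd


def classify_tid(tid):
    """Return 'standard', 'legacy', or 'invalid' for a candidate TID string."""
    if TID_STANDARD.fullmatch(tid):
        return "standard"
    if TID_LEGACY.fullmatch(tid):
        return "legacy"
    return "invalid"


# The ID from the metadata example slide follows the usual layout.
print(classify_tid("BVW63A00"))
```

A check like this could flag malformed IDs when the metadata is loaded, before any indexing is done.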

