
1 Technology-Assisted Review Can be More Effective and More Efficient Than Exhaustive Manual Review
Gordon V. Cormack, University of Waterloo, gvcormac@uwaterloo.ca, (519) 888-4567 x34450
Maura R. Grossman, Wachtell, Lipton, Rosen & Katz, mrgrossman@wlrk.com, (212) 403-1391

2 Watson Versus Jennings and Rutter

3 Debunking the Myth of Manual Review
- The Myth:
  - That “eyeballs-on” review of each and every document in a massive collection of ESI will identify essentially all responsive (or privileged) documents; and
  - That computers are less reliable than humans in identifying responsive (or privileged) documents.
- The Facts:
  - Humans miss a substantial number of responsive (or privileged) documents;
  - Computers—aided by humans—find at least as many responsive (or privileged) documents as humans alone; and
  - Computers—aided by humans—make fewer errors on responsiveness (or privilege) than humans alone, and are far more efficient than humans.

4 Human Assessors Disagree!
- Suppose two assessors, A and B, review the same set of documents;
- Overlap = (# documents coded responsive by both A and B) / (# documents coded responsive by A or B, or both)
- Example: Primary and secondary assessors both code 2,504 documents as responsive. One or both code 2,531 + 2,504 + 463 = 5,498 documents as responsive. Overlap = 2,504 / 5,498 = 45.5%.

5 More Human Assessors Disagree Even More!
- Suppose three assessors, A, B, and C, review the same set of documents;
- Overlap = (# documents coded responsive by A and B and C) / (# documents coded responsive by one or more of A, B, or C)
- Example: Primary, secondary, and tertiary assessors all code 1,972 documents as responsive. One or more code 1,482 + 532 + 224 + 1,972 + 1,049 + 239 + 522 = 6,020 documents as responsive. Overlap = 1,972 / 6,020 = 32.8%.
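The overlap statistic on the last two slides is a Jaccard-style ratio: documents that every assessor coded responsive, divided by documents that at least one assessor coded responsive. A quick sketch reproducing both worked examples (the set sizes come straight from the slides):

```python
def overlap(all_agree, any_coded):
    """Jaccard-style overlap: # documents coded responsive by every assessor,
    divided by # documents coded responsive by at least one assessor."""
    return all_agree / any_coded

# Slide 4: both assessors code 2,504 documents responsive;
# one or both code 2,531 + 2,504 + 463 = 5,498 documents responsive.
two_way = overlap(2504, 2531 + 2504 + 463)
print(f"{two_way:.1%}")  # 45.5%

# Slide 5: all three assessors code 1,972 documents responsive;
# one or more code 1,482 + 532 + 224 + 1,972 + 1,049 + 239 + 522 = 6,020.
three_way = overlap(1972, 1482 + 532 + 224 + 1972 + 1049 + 239 + 522)
print(f"{three_way:.1%}")  # 32.8%
```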

6 Pairwise Assessor Overlap in the TREC 4 IR Task (Voorhees 2000)

7 Assessor Overlap With the Original Response to a DOJ Second Request (Roitblat et al. 2010)

8 Assessor Overlap: IR Versus Legal Tasks

9 What is the “Truth”? Option #1: Deem Someone Correct
Deem the primary reviewer the gold standard (Voorhees 2000).

10 What is the “Truth”? Option #2: Take the Majority Vote
Deem the majority vote the gold standard.
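A minimal sketch of Option #2, assuming an odd number of binary codings per document so there are no ties (the labels are illustrative):

```python
from collections import Counter

def majority_vote(codings):
    """Return the label assigned by the majority of assessors
    (assumes an odd number of binary codings, so no ties)."""
    return Counter(codings).most_common(1)[0][0]

# Two of three assessors code the document responsive.
label = majority_vote(["responsive", "responsive", "non-responsive"])
print(label)  # responsive
```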

11 What is the “Truth”? Option #3: Have All Disagreements Adjudicated by a Topic Authority
Have a senior attorney adjudicate all, and only, cases of disagreement (Roitblat et al. 2010; TREC Interactive Task 2009).

12 How Good are Human Eyeballs?
- What do we mean by “How Good”?
  - Recall;
  - Precision; and
  - F1.

13 Measures of Information Retrieval
- Recall = (# of responsive documents retrieved) / (total # of responsive documents in the entire document collection)
  (“How many of the responsive documents did I find?”)
- Precision = (# of responsive documents retrieved) / (total # of documents retrieved)
  (“How much of what I retrieved was actually responsive, rather than junk?”)
- F1 = the harmonic mean of Recall and Precision.
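All three measures follow directly from counts of true positives (responsive and retrieved), false positives (non-responsive but retrieved), and false negatives (responsive but missed). A minimal sketch with hypothetical counts:

```python
def recall(tp, fn):
    """Fraction of all truly responsive documents that were retrieved."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Fraction of retrieved documents that are truly responsive."""
    return tp / (tp + fp)

def f1(tp, fp, fn):
    """Harmonic mean of recall and precision."""
    r, p = recall(tp, fn), precision(tp, fp)
    return 2 * p * r / (p + r)

# Hypothetical review: 80 responsive documents retrieved,
# 20 non-responsive documents retrieved, 120 responsive documents missed.
print(f"Recall:    {recall(80, 120):.0%}")    # 40%
print(f"Precision: {precision(80, 20):.0%}")  # 80%
print(f"F1:        {f1(80, 20, 120):.2f}")    # 0.53
```

Note that F1 punishes an imbalance: a review with perfect precision but poor recall (or vice versa) still scores low.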

14 Recall and Precision

15 The Recall-Precision Trade-Off
[Chart: Precision versus Recall (0%–100% on both axes), showing the TREC Best benchmark (best performance on precision at a given recall), Perfection, a typical result in a manual responsiveness review, and Blair & Maron (1985).]

16 How Good is Manual Review?

17 Effectiveness of Manual Review

18 How Good is Technology-Assisted Review?

19 What is “Technology-Assisted Review”?

20 Defining “Technology-Assisted Review”
- The use of machine learning technologies to categorize an entire collection of documents as responsive or non-responsive, based on human review of only a subset of the document collection. These technologies typically rank the documents from most to least likely to be responsive to a specific information request. This ranking can then be used to “cut” or partition the documents into one or more categories, such as potentially responsive or not, in need of further review or not, etc.
- Think of a spam filter that reviews and classifies email into “ham,” “spam,” and “questionable.”

21 Types of Machine Learning
- SUPERVISED LEARNING = the human chooses the document exemplars (the “seed set”) to feed to the system, and requests that the system rank the remaining documents in the collection according to their similarity to, or difference from, the exemplars (i.e., “find more like this”).
- ACTIVE LEARNING = the system chooses the document exemplars to feed to the human, requests that the human make responsiveness determinations on them, and then learns from those determinations and applies that learning to the remaining documents in the collection.
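To make the supervised “find more like this” step concrete, here is a toy sketch that ranks unreviewed documents by word-overlap (cosine) similarity to a human-chosen seed set. The scoring function and the documents are illustrative assumptions, not any vendor's actual algorithm:

```python
from collections import Counter
import math

def bow(text):
    """Bag-of-words vector (word counts) for a document."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def rank_by_seed(seed_docs, collection):
    """Rank unreviewed documents by similarity to the responsive seed set."""
    centroid = bow(" ".join(seed_docs))  # pool the seed exemplars
    return sorted(collection, key=lambda d: cosine(bow(d), centroid),
                  reverse=True)

seeds = ["shred the audit documents", "delete retention emails"]
collection = [
    "fantasy football picks for sunday",
    "please shred these documents before the audit",
    "lunch menu for friday",
]
ranked = rank_by_seed(seeds, collection)
print(ranked[0])  # the shredding email ranks first
```

An active-learning system would invert the roles: it would pick the documents whose scores are least certain, ask the human to code them, and re-rank the collection after each answer.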

22 Machine Learning Step #1: Achieving High Precision
[Diagram: document set selected for review. Source: Servient Inc., http://www.servient.com/]

23 Machine Learning Step #2: Improving Recall
[Diagram: document set excluded from review. Source: Servient Inc., http://www.servient.com/]

24 How Do We Evaluate Technology-Assisted Review?

25 The Text REtrieval Conference (“TREC”): Measuring the Effectiveness of Technology-Assisted Review
- International, interdisciplinary research project sponsored by the National Institute of Standards and Technology (NIST), which is part of the U.S. Department of Commerce.
- Designed to promote research into the science of information retrieval.
- The first TREC conference was held in 1992; the TREC Legal Track began in 2006.
- Designed to evaluate the effectiveness of search technologies in the context of e-discovery.
- Employs hypothetical complaints and requests for production drafted by members of The Sedona Conference®.
- For the first three years (2006-2008), documents were from the publicly available 7-million-document tobacco litigation Master Settlement Agreement database.
- Since 2009, publicly available Enron data sets have been used.
- Participating teams of information scientists from around the world and U.S. litigation support service providers have contributed computer runs attempting to identify responsive (or privileged) documents.

26 The TREC Interactive Task
- The Interactive Task was introduced in 2008, and repeated in 2009 and 2010.
- It models a document review for responsiveness.
- It begins with a mock complaint and associated requests for production (“topics”).
- It has a single Topic Authority (“TA”) for each topic.
- Teams may interact with the Topic Authority for up to 10 hours.
- Each team must submit a binary (“responsive” / “unresponsive”) decision for each and every document in the collection for their assigned topic(s).
- It provides for a two-step assessment and adjudication process for the gold standard: where the team and assessor agree on coding, the coding decision is deemed correct; where they disagree, appeal is made to the Topic Authority, who determines which coding decision is correct.

27 Effectiveness of Technology-Assisted Review at TREC 2009

28 Manual Versus Technology-Assisted Review

29 But!
- Roitblat, Voorhees, and the TREC 2009 Interactive Task all used different datasets, different topics, and different gold standards, so we cannot directly compare them.
- While technology-assisted review appears to be at least as good as manual review, we need to control for these differences.

30 Effectiveness of Manual Versus Technology-Assisted Review

31 So, Technology-Assisted Review is at Least as Effective as Manual Review, But is it More Efficient?

32 Efficiency of Technology-Assisted Versus Exhaustive Manual Review
- Exhaustive manual review involves coding 100% of the documents, while technology-assisted review involves coding of between 0.5% (Topic 203) and 5% (Topic 207) of the documents.
- Therefore, on average, technology-assisted review is 50 times more efficient than exhaustive manual review.
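Since exhaustive manual review codes 100% of the collection, the efficiency multiplier for a topic is simply the reciprocal of the fraction of documents coded in technology-assisted review. A quick check of the two endpoints quoted on this slide (the slide's 50x figure is an average falling between them):

```python
# Fraction of the collection coded by humans in technology-assisted review,
# for the two TREC 2009 topics cited on this slide.
fractions = {"Topic 203": 0.005, "Topic 207": 0.05}

# Exhaustive manual review codes 100% of documents, so the efficiency
# multiplier is the reciprocal of the fraction coded.
multipliers = {topic: 1 / frac for topic, frac in fractions.items()}

for topic, m in multipliers.items():
    # Topic 203: 200x; Topic 207: 20x.
    print(f"{topic}: {m:.0f}x fewer documents coded than exhaustive review")
```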

33 Why Are Humans So Lousy at Document Review?

34 Topic 204 (TREC 2009)
- Document Request: All documents or communications that describe, discuss, refer to, report on, or relate to any intentions, plans, efforts, or activities involving the alteration, destruction, retention, lack of retention, deletion, or shredding of documents or other evidence, whether in hard-copy or electronic form.
- Topic Authority: Maura R. Grossman (Wachtell, Lipton, Rosen & Katz)

35 Inarguable Error for Topic 204

36 Interpretation Error for Topic 204

37 Arguable Error for Topic 204

38 Topic 207 (TREC 2009)
- Document Request: All documents or communications that describe, discuss, refer to, report on, or relate to fantasy football, gambling on football, and related activities, including but not limited to, football teams, football players, football games, football statistics, and football performance.
- Topic Authority: K. Krasnow Waterman (LawTechIntersect, LLC)

39 Inarguable Error for Topic 207

40 Interpretation Error for Topic 207

41 Arguable Error for Topic 207

42 Types of Manual Coding Errors

43 Take-Away Messages
- Technology-assisted review finds at least as many responsive documents as exhaustive manual review (meaning that recall is at least as good).
- Technology-assisted review is more accurate than exhaustive manual review (meaning that precision is much better).
- Technology-assisted review is orders of magnitude more efficient than manual review (meaning that it is quicker and cheaper).

44 Measurement is Key
- Not all technology-assisted review (and not all exhaustive manual review) is created equal.
- Measurement is important in selecting and defending an e-discovery strategy.
- Measurement also is critical in discovering better search methods and tools.

45 Additional Resources
- TREC: http://trec.nist.gov/
- TREC Legal Track: http://trec-legal.umiacs.umd.edu/
- TREC 2008 Overview: http://trec.nist.gov/pubs/trec17/papers/LEGAL.OVERVIEW08.pdf
- TREC 2009 Overview: http://trec.nist.gov/pubs/trec18/papers/LEGAL09.OVERVIEW.pdf
- TREC 2010 Overview: Forthcoming (April 2011) at http://trec-legal.umiacs.umd.edu/
- Maura R. Grossman & Gordon V. Cormack, Technology-Assisted Review Can be More Effective and More Efficient Than Exhaustive Manual Review, XVII:3 Richmond Journal of Law & Technology (Spring 2011) (in press).

46 Questions? Thank You!

