Defensible Quality Control for E-Discovery. Geoff Black and Albert Barsocchini. www.encase.com/ceic

Presentation transcript:

Defensible Quality Control for E-Discovery Geoff Black and Albert Barsocchini

A Question for the Audience: How do you defend your collections?

How Much to Collect: 1. Full Disk Image – safe, but costly and time consuming. 2. User-Created Data – probably the most often used in discovery. 3. Targeted Collections Based on Early Case Assessment – the current trend.

How Much to Collect (Full Disk Image | User-Created Data | Targeted): Let the downstream tools (processing, filtering, review) do the work. Sampling is still beneficial for all of these collection methods.

Legal Trends in Discovery: More discovery about discovery; more sanction decisions; utilizing more than one methodology or technology at different stages of the process; transparency in the discovery process; courts expect attorneys to understand available technology and use it.

Legal Trends in Discovery: The increased use of lawyers with practices focused on eDiscovery; attorneys must demonstrate that the discovery process used is defensible and reasonable; increased adoption of predictive coding; courts expect discovery to be proportional to the case; still no single "magic bullet" to solve the challenges of discovery.

Legal Trends in Discovery: Increased adoption of information governance programs, including defensible disposal of data; proliferation of data sources; the days of granting carte blanche discovery are over; more use of early case assessment.

Sampling – Why Do It? Ensure quality and accuracy of the collection or of the processing results; defensibility.

Types of Sampling: Judgmental – subjectively defined data set. Statistical – randomly selected data.

The Challenges: Selecting appropriate filters for the target data set; achieving a high confidence level and a low margin of error.

Statistics – Margin of Error: The half-width of the “confidence interval”. It describes how closely the sample results will reflect the general population. A lower margin of error is better.

Statistics – Margin of Error: We have 100 documents and our margin of error is ± 2%. Testing shows 10% responsiveness. So the general population should show between 8% and 12% responsiveness, or 8 to 12 documents.
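The arithmetic on this slide can be sketched in a few lines of Python. This is an illustration only, not something from the presentation: the function names are made up, and the margin-of-error formula is the standard normal approximation for a sample proportion, which the slides do not name. The 900-document sample is a hypothetical figure chosen to show where a margin of roughly ± 2% could come from.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat: float, n: int, confidence: float = 0.95) -> float:
    """Normal-approximation margin of error for a sample proportion.

    Assumption: simple random sample, large enough for the approximation.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sqrt(p_hat * (1 - p_hat) / n)


def interval(p_hat: float, moe: float) -> tuple[float, float]:
    """Range implied by an observed proportion and a margin of error."""
    return max(0.0, p_hat - moe), min(1.0, p_hat + moe)


# The slide's numbers: 10% responsive, +/- 2% margin of error.
low, high = interval(0.10, 0.02)
print(f"expected responsiveness: {low:.0%} to {high:.0%}")

# Hypothetical sample: 900 documents, 10% responsive, 95% confidence.
print(f"margin of error: +/- {margin_of_error(0.10, 900):.1%}")
```

Under these assumptions, a smaller sample or a higher confidence level widens the margin, which is the trade-off the following slides discuss.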

Statistics – Confidence Level: Does the sample accurately represent the results of the general population? A higher confidence level is better.

Statistics – Confidence Level: What does a 95% confidence level mean? If the sampling were repeated 100 times, about 95 of the samples would give results within the margin of error of the true population value. Gallup Polls: 98% accuracy in Presidential elections.

What’s The Catch? You must filter out documents that you know for sure contain nothing of value: .exe, .dll, etc.

Statistics for eDiscovery: Sample Sizes for Population of 1,000,000, by Margin of Error [table not preserved]

Statistics for eDiscovery [Scaling]: Population Size [figure not preserved]

Statistics for eDiscovery [Scaling]: “Every cook knows that it only takes a single sip from a well-stirred soup to determine the taste.” You can visualize what happens when the soup is poorly stirred. If well-stirred, a single sip is sufficient both for a small pot and a large pot.
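The scaling point can be checked numerically. The sketch below is an assumption, not the presenters' method: it uses Cochran's sample-size formula with a finite population correction and the conservative worst-case proportion p = 0.5, and the function name `sample_size` is hypothetical. The slides do not say which formula their table was built from.

```python
from math import ceil
from statistics import NormalDist


def sample_size(population: int, margin: float,
                confidence: float = 0.95, p: float = 0.5) -> int:
    """Cochran's sample size with a finite population correction.

    Assumptions: simple random sampling; p = 0.5 is the worst case.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2   # size for an unlimited population
    n = n0 / (1 + (n0 - 1) / population)      # correct for a finite population
    return ceil(n)


# Sample size against margin of error for a population of 1,000,000.
for margin in (0.05, 0.02, 0.01):
    print(f"N=1,000,000  margin={margin:.0%}  n={sample_size(1_000_000, margin)}")

# Sample size against population size at a fixed 2% margin of error.
for population in (10_000, 100_000, 1_000_000, 10_000_000):
    print(f"N={population:>10,}  margin=2%  n={sample_size(population, 0.02)}")
```

With these assumptions the required sample grows quickly as the margin of error shrinks but levels off as the population grows, which matches the soup analogy: the size of the pot matters far less than how well it is stirred.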

Sampling Workflow: Finding a good search method is difficult. Who chooses search terms? Requires iterative testing and validation.

Sampling Workflow: Select random sample; review sample for relevance; search sample with proposed keywords; compare results; extrapolate expected relevance and error rates on data set. Can be done in parallel. Iterate keywords, and re-test as necessary.
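The compare-and-extrapolate steps of this workflow can be sketched as follows. Everything here is illustrative: the `Doc` class, the substring-based keyword matching, and the choice of precision and recall as the comparison metrics are assumptions, not details given in the presentation.

```python
import random
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    relevant: bool = False  # set by a human reviewer after sampling


def draw_sample(collection: list[Doc], k: int, seed: int = 0) -> list[Doc]:
    """Select a simple random sample of k documents."""
    return random.Random(seed).sample(collection, k)


def keyword_hit(doc: Doc, keywords: list[str]) -> bool:
    """Naive case-insensitive substring search (a stand-in for a real search)."""
    text = doc.text.lower()
    return any(k.lower() in text for k in keywords)


def evaluate(sample: list[Doc], keywords: list[str]) -> dict[str, float]:
    """Compare keyword hits against the reviewers' relevance calls."""
    tp = sum(1 for d in sample if d.relevant and keyword_hit(d, keywords))
    fp = sum(1 for d in sample if not d.relevant and keyword_hit(d, keywords))
    fn = sum(1 for d in sample if d.relevant and not keyword_hit(d, keywords))
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,  # hits that are relevant
        "recall": tp / (tp + fn) if tp + fn else 0.0,     # relevant docs that were hit
        "relevance_rate": (tp + fn) / len(sample),
    }


# A tiny reviewed sample (hypothetical documents and relevance calls).
reviewed = [
    Doc("merger agreement draft", True),
    Doc("lunch menu", False),
    Doc("merger rumor newsletter", False),
    Doc("board minutes on acquisition", True),
]

# Iterate the keyword list and re-test on the same reviewed sample.
for keywords in (["merger"], ["merger", "acquisition"]):
    print(keywords, evaluate(reviewed, keywords))

# Extrapolate to a hypothetical collection of 1,000,000 documents.
rate = evaluate(reviewed, ["merger"])["relevance_rate"]
print(f"estimated relevant documents: {rate * 1_000_000:,.0f}")
```

A real sample would be drawn with something like `draw_sample` and be large enough to support the chosen confidence level and margin of error; the four documents here only keep the example readable.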

Sampling Workflow: Wait a minute, I always test my keywords! Remember: it’s not whether you test, but what you test on…

Sampling Benefits: Small dataset for testing; minimize false positives; more accurate search, reduced data volume; defensibility of statistically validated testing.

Using ECC for Random Sampling
Pros: Saves the cost of loading into review platform; all steps performed in EnCase for collection, processing, and review.
Cons: Requires an external EnScript for sampling; extra step to import random sample results back into ECC; review capabilities less than ideal.

EnCase eDiscovery Workflow Hands-On [workflow diagram; node labels: Collect Data in ECC; Fork to eDocs and L01s; eDocs L01s (Entries); L01s (Records); Random Sampler EnScript; Sample eDocs L01s; Sample L01s; Review & Test]

EnCase eDiscovery Workflow Hands-On: What is a “Workflow” in EnCase eDiscovery?

EnCase eDiscovery Workflow Hands-On: Look good? [screenshot; labels: WF Collected eDocs; WF Forked; WF Forked eDocs; WF Processed; WF Processed eDocs; Fork from eDocs; Process; Process eDocs]

EnCase eDiscovery Workflow Hands-On: Survey says… [screenshot; labels: WF Collected eDocs; WF Forked; WF Forked eDocs; WF Processed; WF Processed eDocs; Fork from eDocs; Process; Process eDocs]

Random Sampler EnScript Hands-On: External EnScript, not a part of EnCase eDiscovery. Uses known formulas to determine sample size. Preferred input is L01s created by EnCase eDiscovery. Auto-detects the L01 type (Entries vs. Records). Creates a random sample across all of the L01s and outputs items to new sample L01s (“*.SAMPLES.L01”).
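The idea of sampling across several containers at once can be illustrated in Python. This is not the EnScript and makes no claim about how it works internally: the container names, the item lists, and the function `sample_across_containers` are hypothetical stand-ins for L01 files and their entries. The one point it shows is that pooling all items before drawing gives every item the same chance of selection, regardless of which container holds it.

```python
import random
from typing import Optional


def sample_across_containers(containers: dict[str, list[str]], k: int,
                             seed: Optional[int] = None) -> list[tuple[str, str]]:
    """Draw k items uniformly at random from the pooled contents of all
    containers, returning (container name, item) pairs."""
    pool = [(name, item) for name, items in containers.items() for item in items]
    if k > len(pool):
        raise ValueError("sample size exceeds the number of items available")
    return random.Random(seed).sample(pool, k)


# Hypothetical containers standing in for L01 files of different sizes.
containers = {
    "custodian_a": [f"a_doc_{i}" for i in range(500)],
    "custodian_b": [f"b_doc_{i}" for i in range(50)],
}

picks = sample_across_containers(containers, k=20, seed=1)
for name, item in picks[:5]:
    print(name, item)
```

Sampling a fixed number of items from each container instead would over-represent the smaller container, which is why the sketch pools first.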

Using Review Platforms for Sampling
Pros: Sampling can be performed directly in the review platform; robust reviewer and oversight capabilities; once the data is in the review platform, you don’t need to go back to EnCase.
Cons: Extra costs associated; split workflow requires moving data outside of EnCase and into review platform.

Statistical Sampling With Relativity

Statistical Sampling With Clearwell

Contact Info & Download: Geoff Black, Product Manager, Digital Forensics, Stroz Friedberg LLC; Albert Barsocchini, Discovery Counsel & Director of Strategic Consulting, NightOwl Discovery

Thank You Geoff Black and Albert Barsocchini