Background: The Big Data Era
Companies generate data at staggering rates:
- Global email traffic creates 183 billion new messages every day.
- ExxonMobil's employees create 5.2 million new messages every day.
- Corporate data is growing at a 30% annual rate.
- By 2020, 26 billion devices will be connected to the internet (more than three devices for every person on the planet).
Case Study 1: Reducing Data Volumes in Cross-Border Litigations and Investigations
Analytics in cross-border matters
The Client: A Taiwanese Fortune 500 manufacturer facing a global corruption investigation, represented by US-based counsel.
The Challenge: Approximately 8.5 million documents in various languages requiring analysis and review under very tight deadlines.
The Solution:
- Traditional culling: 8,462,117 documents reduced to 1,651,094 through deduplication and the application of search terms and date filters (an 80.5% cull rate).
- Advanced analytics culling: using "concept clustering" and "find more like this" functionality, an additional 1,387,350 documents were removed from the review set, leaving only 263,744 to be reviewed (a 96.9% total cull rate).
A sketch of this two-stage cull appears below.
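The two-stage cull described above is straightforward to express in code. Below is a minimal sketch, assuming documents arrive as dicts with text, date, and (for the analytics pass) a precomputed concept-cluster ID; the field names and the cluster-sampling workflow are illustrative assumptions, not the vendor's actual pipeline.

```python
import hashlib

# Minimal sketch of a two-stage cull. Field names ("text", "date",
# "cluster_id") and the analytics pass are illustrative assumptions,
# not the vendor's actual pipeline.

def traditional_cull(docs, terms, start, end):
    """Deduplicate by content hash, then keep documents inside the
    date window that hit at least one search term."""
    seen, survivors = set(), []
    for doc in docs:
        digest = hashlib.sha1(doc["text"].encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate: cull
        seen.add(digest)
        in_window = start <= doc["date"] <= end
        hits_term = any(t.lower() in doc["text"].lower() for t in terms)
        if in_window and hits_term:
            survivors.append(doc)
    return survivors

def analytics_cull(docs, irrelevant_clusters):
    """Second pass: drop documents whose concept cluster was judged
    irrelevant after reviewers sampled exemplars from each cluster."""
    return [d for d in docs if d["cluster_id"] not in irrelevant_clusters]
```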
Analytics in cross-border matters
The Results:
- Volume reduction: 8,462,117 initial documents reduced to only 263,744 over the course of two days (a total cull rate of 96.9%).
- Cost savings: $1,165,374 in attorney fees saved (at an average review rate of 50 documents/hour, charged at $42/hour for contract review attorneys).
- Time savings: 27,747 hours of attorney review saved (at an average review rate of 50 documents/hour).
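The savings figures follow arithmetically from the stated assumptions (1,387,350 documents removed by analytics, 50 documents/hour, $42/hour); a quick check:

```python
# Verify the reported time and cost savings from the stated assumptions.
docs_removed = 1_387_350   # documents culled by the analytics pass
review_rate = 50           # documents reviewed per hour
hourly_rate = 42           # USD per hour for contract review attorneys

hours_saved = docs_removed / review_rate
fees_saved = hours_saved * hourly_rate
print(f"{hours_saved:,.0f} hours")  # 27,747 hours
print(f"${fees_saved:,.0f}")        # $1,165,374
```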
Case Study 2: Establishing Personal Jurisdiction Through Analytics
Establishing Personal Jurisdiction Through Analytics
The Client: A global plaintiffs' litigation boutique.
The Case: A large class action against dozens of international banks alleged to have participated in a global rate-fixing scandal.
The Challenge: At the outset of the case, the defendants produced over 1.5 million documents. Shortly thereafter, certain defendants moved to dismiss for lack of personal jurisdiction. Plaintiffs had only 60 days to oppose the motion, not enough time to review the documents to identify contacts between the foreign defendants and the United States.
The Solution: TransPerfect searched all 1.5 million documents for any communications (emails, text messages, Bloomberg chats, audio recordings, IMs, etc.) involving US-based custodians, based on a list provided by outside counsel. The search returned 324,830 documents, which were provided to Praescient to visually "map" the communications based on IP addresses, phone numbers, and addresses in signature blocks, to identify all communications between the US and the foreign defendants. A sketch of the custodian filter appears below.
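A custodian-based communications filter like the one described is simple to sketch. The document schema (the "type", "from", and "to" fields) is an assumption about the review platform's export format, not TransPerfect's actual tooling.

```python
# Sketch of the custodian filter: keep any communication that involves
# a custodian on the US-based list supplied by outside counsel.
COMM_TYPES = {"email", "text", "bloomberg_chat", "im", "audio"}

def involves_us_custodian(doc, us_custodians):
    # Assumed fields: "type" (communication type), "from"/"to" (lists
    # of custodian identifiers).
    participants = set(doc.get("from", [])) | set(doc.get("to", []))
    return doc["type"] in COMM_TYPES and bool(participants & us_custodians)

def cull_for_jurisdiction(docs, us_custodians):
    us_custodians = set(us_custodians)
    return [d for d in docs if involves_us_custodian(d, us_custodians)]
```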
Establishing Personal Jurisdiction Through Analytics
The Analytic Process: Raw case data was ingested into cutting-edge link-analysis software, facilitating more robust and efficient analysis of the data. Specifically, large data sets were broken into smaller networks of associates and communications in which specific actions (emails, phone calls, etc.) could be identified and investigated based on temporal and geographic indicators.
The Results: Of the 324,830 documents sent to Praescient, 76,800 were identified as containing communications between the US and the foreign defendants. Outside counsel was provided with analytic insights and a graphical representation of the communication flow of those documents, which were also loaded onto an online review platform for review.
[Diagram: communication flow between US-based and UK-based actors]
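A minimal sketch of the link-analysis step, using the networkx graph library: each communication becomes a directed edge annotated with time, channel, and geography, and the US/foreign subnetwork is read off the graph. The attribute names, and the upstream geo-resolution from IP addresses, phone numbers, and signature blocks, are assumptions rather than Praescient's actual implementation.

```python
import networkx as nx

def build_comm_graph(comms):
    """One directed edge per communication, annotated with assumed
    attributes (timestamp, channel, resolved countries)."""
    g = nx.MultiDiGraph()
    for c in comms:
        g.add_edge(c["sender"], c["recipient"],
                   when=c["timestamp"], channel=c["channel"],
                   sender_country=c["sender_country"],
                   recipient_country=c["recipient_country"])
    return g

def us_foreign_edges(g):
    """Edges with exactly one US endpoint (a US <-> foreign contact)."""
    return [(u, v, d) for u, v, d in g.edges(data=True)
            if "US" in (d["sender_country"], d["recipient_country"])
            and d["sender_country"] != d["recipient_country"]]
```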
Establishing Personal Jurisdiction Through Analytics
Outcome: Ultimately, the defendants' motion was denied. The analysis produced better investigative insights in less time, reducing hundreds of thousands of documents, emails, and internal chats down to a consolidated list of "high-value" incidents and actors.
Case Study 3: Reusing Assets to Save Time and Money
Reusing Assets to Save Time and Money
What is it?
- A central datastore for eDiscovery metadata across related matters for a large financial institution:
  - Maintains prior privilege and responsiveness calls.
  - Generates custom metadata properties.
- Analyze document-level metadata across matters:
  - Evaluate privilege screens.
  - Identify inconsistent calls for reconciliation.
- Dense document analysis: identifying frequently occurring documents.
- Workflow modeling and management: inform review workflow based on prior reviews and document properties.
One possible layout for such a store is sketched after this list.
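One plausible layout for a cross-matter store is a single table keyed by content hash and matter, so prior calls for a document can be looked up from any related matter. This is a minimal sketch under that assumption; table and column names are illustrative.

```python
import sqlite3

# Illustrative cross-matter metadata store: one row per
# (document, matter) pair, keyed by a content hash that is stable
# across matters.
conn = sqlite3.connect("ediscovery_meta.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS coding_decisions (
    doc_hash    TEXT NOT NULL,   -- content hash, stable across matters
    matter_id   TEXT NOT NULL,
    responsive  INTEGER,         -- 1 / 0 / NULL (not yet coded)
    privileged  INTEGER,         -- 1 / 0 / NULL
    custom_tags TEXT,            -- e.g. 'hot', 'bank-exam-privilege'
    PRIMARY KEY (doc_hash, matter_id)
);
""")

def prior_calls(conn, doc_hash):
    """All previous coding decisions for this document across matters."""
    return conn.execute(
        "SELECT matter_id, responsive, privileged, custom_tags "
        "FROM coding_decisions WHERE doc_hash = ?", (doc_hash,)).fetchall()
```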
Reusing Assets to Save Time and Money
View Discovery Calls Across Matters
- Over 1.4 million documents had been previously coded (3.5 million coding decisions) in 8 related litigations:
  - 82% non-responsive; 18% responsive.
  - 76% non-privileged; 14% privileged.
  - Other coding categories: bank examination privilege, non-public personal information, and "hot."
Use Prior Coding to Alter Workflow in Current Litigation
- Reduce Tier 2 review by the law firm.
- Identified ~10,000 documents that hit new search terms, were tagged "responsive" by contract attorneys in the current review, and had previously been Tier 2-reviewed as "responsive/non-privileged."
- These were produced without Tier 2 review, except documents tagged "Hot."
- Over $75,000 in savings.
The routing rule is sketched below.
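The Tier 2 bypass can be sketched as a routing rule: skip Tier 2 for documents whose content hash matches a prior responsive/non-privileged call, unless tagged hot. Field and helper names here are hypothetical.

```python
def skip_tier2(prior):
    """Skip Tier 2 only for a prior responsive, non-privileged,
    non-hot call; `prior` is the earlier coding decision or None."""
    return (prior is not None
            and prior["responsive"]
            and not prior["privileged"]
            and "hot" not in prior["tags"])

def route_review(docs, prior_by_hash):
    """Split the current review set into direct-production and
    Tier 2 queues based on prior coding (looked up by content hash)."""
    produce_directly, tier2_queue = [], []
    for doc in docs:
        prior = prior_by_hash.get(doc["hash"])
        (produce_directly if skip_tier2(prior) else tier2_queue).append(doc)
    return produce_directly, tier2_queue
```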
Reusing Assets to Save Time and Money
Enhance Privilege Screen / Quality Control
- Tested the privilege search term list against previously Tier 2-coded privileged documents (~80,000 privileged documents):
  - Very high recall by family (over 98%).
  - Low precision (less than 20%).
- Eliminating low-precision terms had little impact on recall or precision.
- Modifying terms had little impact on recall or precision.
- Worst offenders: "Legal," "Privilege," "Confidential," "Lawyer," and "Counsel."
Conclusion
- Search terms are not an efficient way to identify privileged documents, resulting in a costly review process.
- Search terms can be designed with high recall for QC to reduce risk, but with high review costs.
The evaluation is sketched below.
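The privilege-screen test amounts to computing recall and precision of the term list against documents with known privilege coding. A minimal sketch, assuming "text" and "privileged" fields and ignoring the family-level grouping the slide measured recall by:

```python
def evaluate_terms(docs, terms):
    """Recall and precision of a search-term screen against documents
    whose privilege status is known from prior coding."""
    terms = [t.lower() for t in terms]
    tp = fp = fn = 0
    for doc in docs:
        hit = any(t in doc["text"].lower() for t in terms)
        if hit and doc["privileged"]:
            tp += 1   # screen caught a privileged document
        elif hit:
            fp += 1   # false hit: non-privileged document flagged
        elif doc["privileged"]:
            fn += 1   # privileged document the screen missed
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision
```

On the slide's numbers, broad terms like "legal" and "confidential" drive recall above 98% while dragging precision under 20%: they hit nearly every privileged document, and a great many non-privileged ones.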
Reusing Assets to Save Time and Money
Additional Workflow Alterations
- Tier 1 review identified ~20,000 privileged documents; fewer than 2,000 had previously been tagged as privileged by Tier 2.
- Developing additional analytics to help identify privileged and non-privileged documents using metadata (a sketch follows this list).
- For future cases, use data analytics from earlier cases:
  - Eliminate dense documents from Tier 1 review: very high rate of non-responsiveness (more than 98%); sampling possible to confirm.
  - Use analytics (metadata) to help identify other highly likely non-responsive documents and eliminate them from Tier 1 review.
  - Use analytics (metadata) to better identify (with high precision) privileged, bank-examination-privileged, and hot documents.
- For this business line of the financial institution, extend the approach to other cases and uses (e.g., compliance, information governance).
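As one illustration of what metadata-based analytics might look like, here is a sketch of a metadata-only privilege model trained on prior coding decisions, using scikit-learn. The features and the choice of logistic regression are assumptions, not the approach the team actually developed.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def metadata_features(doc):
    # Hypothetical metadata fields; string values are one-hot encoded
    # by DictVectorizer, numeric/boolean values pass through.
    return {
        "sender_domain": doc["sender"].split("@")[-1],
        "has_attorney_recipient": doc["attorney_on_thread"],
        "doc_type": doc["doc_type"],
        "recipient_count": len(doc["recipients"]),
    }

def train_privilege_model(coded_docs):
    """Fit a metadata-only privilege classifier on prior coding calls."""
    X = [metadata_features(d) for d in coded_docs]
    y = [d["privileged"] for d in coded_docs]
    model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(X, y)
    return model
```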
Case Study 4: Analytics-Based Review
Analytics-Based Review
1. Identify the production set based on objective criteria.
2. Remove privileged documents.
3. Identify key documents through iterative targeted searches:
   - Entity search
   - Concept categorization
   - Sentiment analysis
   - Trends & anomalies
   - Machine learning
   Iterate until reasonably complete (see the sketch below).
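Step 3's iteration can be sketched as a simple loop: search, sample-review the hits, fold newly discovered terms or entities back into the queries, and stop when a review pass turns up nothing new. The run_search and review_sample callables are placeholders for the analytics tools listed above.

```python
def iterative_key_doc_search(corpus, seed_queries, run_search,
                             review_sample, max_rounds=10):
    """Iterative targeted search: expand queries from reviewed hits
    until a round surfaces no new leads (or max_rounds is reached)."""
    key_docs, queries = set(), list(seed_queries)
    for _ in range(max_rounds):
        hits = run_search(corpus, queries)
        confirmed, new_queries = review_sample(hits)
        key_docs.update(confirmed)
        if not new_queries:  # reasonably complete: nothing new found
            break
        queries.extend(new_queries)
    return key_docs
```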