The Next Frontier in TAR: Choose Your Own Algorithm


The Next Frontier in TAR: Choose Your Own Algorithm LegalTech New York 2017 Presented by Dr. David Grossman, Georgetown University Tara Emory, Esq., PMP, Director of Consulting for Driven, Inc.

Introduction
- What's inside TAR?
- The Role of the Algorithm
- Leveraging Algorithms to Improve TAR
- Discovery Process Implications
- Q&A

What's inside TAR?
- Workflow
- Software
- Algorithm

What's inside TAR?
- One software product = one algorithm
- Attempts to compare algorithms and products do not isolate workflow vs. software vs. algorithm
- Other variables include the document set and the nature of what you need to find

The Role of the Algorithm: Prior Work
- TAR vs. keywords and manual review
- Teams with different workflows and software on the same issues with the same documents
- Industry tests of one software product vs. another on the same issues and documents
- Different algorithms on the same issues with the same documents

Next Frontier
- Different algorithms on the same documents, compared to the same algorithms on different documents
- Which algorithms work best for different types of cases?

The Role of the Algorithm: TREC Results

TREC Issue | Winner                                                 | Recall at 15%
201        | XGBoost CV (binary)                                    | 92
202        | LSI                                                    | 98
203        | Logistic Regression (log TF-IDF) and LSI (log TF-IDF)  | 97
207        |                                                        | 94

Leveraging Algorithms to Improve TAR
- There is no single "best" algorithm for all cases
- The success of different algorithms varies by:
  - Size of the document set
  - Prevalence of responsive documents in the set
  - Amount of review appropriate for the case
  - Availability of good examples for training
  - Broad vs. narrow topics
  - Single vs. many issues

Discovery Process Implications
- How should attorneys adapt to new understandings of TAR algorithms?
- How does an attorney judge what is reasonable?
- Should algorithm selection be included in discovery negotiations?
- Could this be another point of disagreement between opposing parties?

Questions

Supplement

Workflow
- Sampling (in some workflows)
- Seed set
- Training
- Validation (in some workflows)
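The loop implied by these workflow steps can be sketched in code. This is a minimal illustration under assumed names, not any vendor's product: `tar_workflow`, `label_fn`, and `train_fn` are hypothetical stand-ins for the review platform, the attorney reviewers, and the learning algorithm respectively.

```python
import random

def tar_workflow(documents, label_fn, train_fn, rounds=3, batch_size=5, seed=0):
    """Minimal TAR loop sketch: seed set, iterative training rounds,
    then a validation sample. label_fn(doc) is the human review decision;
    train_fn(labeled) returns a model callable: doc -> predicted label."""
    rng = random.Random(seed)
    pool = list(documents)
    rng.shuffle(pool)  # sampling: draw documents in random order

    # Seed set: an initial batch reviewed by attorneys.
    labeled = [(doc, label_fn(doc)) for doc in pool[:batch_size]]
    unreviewed = pool[batch_size:]
    model = train_fn(labeled)

    for _ in range(rounds):
        # Training: review another batch, retrain on all labels so far.
        batch, unreviewed = unreviewed[:batch_size], unreviewed[batch_size:]
        labeled += [(doc, label_fn(doc)) for doc in batch]
        model = train_fn(labeled)

    # Validation: estimate agreement on a held-out sample.
    sample = unreviewed[:batch_size]
    agreement = sum(model(d) == label_fn(d) for d in sample) / len(sample)
    return model, agreement
```

In a real workflow the batch selection would be driven by the model's scores (active learning) rather than random order, and validation would use statistically grounded sample sizes.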

Technologies: LSI/LSA (Latent Semantic Indexing/Analysis)
- Finds relationships between: words and words, words and topics, topics and documents
- Maps words and documents into a shared semantic space
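A minimal numpy sketch of the LSI idea, using an invented four-term, four-document collection: a singular value decomposition of the term-document matrix places each document in a low-dimensional semantic space, where documents about the same topic land close together even without sharing every word.

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = documents.
# Docs 0-1 discuss contracts, docs 2-3 discuss patents.
A = np.array([
    [2, 1, 0, 0],   # "contract"
    [1, 2, 0, 0],   # "breach"
    [0, 0, 2, 1],   # "patent"
    [0, 0, 1, 2],   # "infringe"
], dtype=float)

# SVD factors A into term directions (U), topic strengths (s),
# and document directions (Vt).
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep k=2 latent topics; each document becomes a 2-d vector.
k = 2
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

same_topic = cosine(doc_vectors[0], doc_vectors[1])  # both contract docs
diff_topic = cosine(doc_vectors[0], doc_vectors[2])  # contract vs. patent
```

Real collections have tens of thousands of terms and a few hundred retained dimensions, but the mechanics are the same.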

Technologies: Logistic Regression
- Like linear regression, except an S-shaped (logistic) curve is fit instead of a straight line
- The curve's output is interpreted as a probability of relevance, between 0 and 1
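The "curve" is the logistic (sigmoid) function: a weighted sum of the document's features is squashed into a value between 0 and 1 that can be read as a probability of relevance. A sketch with made-up weights (in practice the weights are learned from reviewed examples):

```python
import math

def probability_of_relevance(features, weights, bias):
    """Logistic regression scoring: a linear score passed through the
    logistic (sigmoid) curve, yielding a probability in (0, 1)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-score))

# Illustrative weights for two features, e.g. counts of a hot term
# (positive weight) and an irrelevant term (negative weight).
weights, bias = [1.5, -2.0], 0.0

p_high = probability_of_relevance([3, 0], weights, bias)  # strong positive signal
p_mid  = probability_of_relevance([0, 0], weights, bias)  # no signal: exactly 0.5
p_low  = probability_of_relevance([0, 3], weights, bias)  # strong negative signal
```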

Technologies: Bayesian Probability
- Bayesian probability / Naïve Bayes: a probabilistic classifier
- Estimates the probability that a document matches a category, based on the words it contains and on labeled examples
- Each word contributes independently to the likelihood (the "naïve" assumption)
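A toy Naïve Bayes classifier showing how each word contributes an independent log-likelihood term; the training documents and labels below are invented for illustration.

```python
import math
from collections import Counter

def train_naive_bayes(labeled_docs):
    """labeled_docs: list of (word_list, category). Returns a classifier
    that scores each category as log prior + sum of per-word log
    likelihoods (with Laplace smoothing) and picks the highest."""
    word_counts, doc_counts, vocab = {}, Counter(), set()
    for words, category in labeled_docs:
        doc_counts[category] += 1
        counts = word_counts.setdefault(category, Counter())
        for w in words:
            counts[w] += 1
            vocab.add(w)
    total_docs = sum(doc_counts.values())

    def classify(words):
        best, best_score = None, float("-inf")
        for cat, counts in word_counts.items():
            score = math.log(doc_counts[cat] / total_docs)  # prior
            total = sum(counts.values())
            for w in words:
                # Each word contributes independently (the "naive" part).
                score += math.log((counts[w] + 1) / (total + len(vocab)))
            if score > best_score:
                best, best_score = cat, score
        return best

    return classify

# Invented training set: tokenized documents with review labels.
classify = train_naive_bayes([
    (["merger", "price", "agreement"], "responsive"),
    (["merger", "antitrust"], "responsive"),
    (["lunch", "golf"], "not"),
    (["golf", "weekend", "lunch"], "not"),
])
```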

Technologies: SVM (Support Vector Machine)
- A process for making binary decisions
- Documents are mapped to points based on each word's count, expressed as a percentage of the words in the document
- As the user identifies responsive and non-responsive examples, a dividing line (separating hyperplane) between the two classes is determined
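A sketch of the idea, assuming sub-gradient descent on the hinge loss (one common way to train a linear SVM; commercial tools may train differently). Each document is a vector of word frequencies, and training moves the dividing line until examples of both classes sit on the correct side with a margin.

```python
import random

def train_linear_svm(examples, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Linear SVM sketch via sub-gradient descent on the hinge loss.
    examples: list of (feature_vector, label), label +1 = responsive,
    -1 = non-responsive. Returns the dividing line as (weights, bias)."""
    rng = random.Random(seed)
    data = list(examples)
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Margin violation: move the line toward this example.
                w = [wi + lr * (y * xi - reg * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Otherwise only apply the regularization shrinkage.
                w = [wi * (1 - lr * reg) for wi in w]
    return w, b

def classify(w, b, x):
    """Which side of the dividing line the document falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Features: each word's count as a fraction of document length,
# over an invented two-word vocabulary ("contract", "golf").
examples = [
    ([0.8, 0.0], +1), ([0.6, 0.1], +1),   # responsive examples
    ([0.0, 0.7], -1), ([0.1, 0.9], -1),   # non-responsive examples
]
w, b = train_linear_svm(examples)
```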

Other Lexical Techniques
- Rely on linguists and dictionaries; linguists serve as experts and work with attorneys
- Deconstruct language into parts of speech
- Determine classification rules for responsiveness and non-responsiveness based on key words
- May or may not involve machine learning
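Where such rules are reduced to software, they often take the form of an ordered list of keyword patterns. A sketch with invented rules; real rule sets drafted by linguists would be far larger and tuned to the matter.

```python
import re

# Illustrative classification rules of the kind linguists might draft:
# each rule pairs a keyword pattern with the label it assigns.
RULES = [
    (re.compile(r"\bstock\s+options?\b", re.IGNORECASE), "responsive"),
    (re.compile(r"\bbackdat\w*\b", re.IGNORECASE), "responsive"),
    (re.compile(r"\bfantasy\s+football\b", re.IGNORECASE), "non-responsive"),
]

def classify(text, default="needs-review"):
    """Apply the rules in order; the first matching pattern wins.
    Documents matching no rule fall through to human review."""
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return default
```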