Intelligent Email: Reply and Attachment Prediction. Mark Dredze, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira. Presented by.

Presentation transcript:

Intelligent Email: Reply and Attachment Prediction
Mark Dredze, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira
Presented by Nareg Torosian

What’s the use?
- Whittaker & Sidner’s “email overload”
  - Task management
  - Personal archiving
  - Asynchronous communication
- Assist overwhelmed users
- Support enhanced interface

Intelligent? How?
- Prediction tasks treated as binary classification problems
  - Binary vector, where each dimension represents a feature
- Learning performed with logistic regression
- System evaluated using F1, the harmonic mean of precision and recall
- Single-user (adaptive) and cross-user (adaptable) settings
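
The setup on this slide (binary feature vectors, logistic regression, F1) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the training loop is ordinary batch gradient descent, and the three toy features, the learning rate, and the epoch count are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log loss w.r.t. the score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, x, threshold=0.5):
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(score) >= threshold else 0

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy data (hypothetical): [has_question_mark, sender_in_address_book, user_in_cc]
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
y = [1, 1, 0, 0, 1, 0]  # 1 = needs reply

w, b = train_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
print(f1_score(y, predictions))
```

Scoring on the training data, as done here, only checks that the pieces fit together; a real evaluation would score held-out messages.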

Reply prediction
- Indicate which messages require reply
- Allow user to manage these messages

Reply prediction features
- Relational features
  - Based on user profile
    - # of sent and received messages, address book, address and domain
  - I appear in the CC list, I frequently reply to this user, etc.
  - 200 in Dredze et al.’s experiment
- Document features
  - Presence of question marks and question words
    - TF-IDF (term frequency – inverse document frequency) scores
  - Presence of attachments
  - 14,800 in Dredze et al.’s experiment
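
A few of the relational and document features listed here could be computed roughly as follows. This is a hedged sketch: the `Message` and `UserProfile` classes, their field names, and the question-word list are hypothetical, and the TF-IDF features are left out for brevity.

```python
import re
from dataclasses import dataclass, field

# Assumed question-word list; the slides do not say which words were used.
QUESTION_WORDS = re.compile(r"\b(who|what|when|where|why|how|which)\b", re.IGNORECASE)

@dataclass
class Message:  # hypothetical container for one email
    sender: str
    cc: list
    body: str
    has_attachment: bool = False

@dataclass
class UserProfile:  # hypothetical summary of the mailbox owner's history
    address: str
    address_book: set = field(default_factory=set)
    frequent_reply_targets: set = field(default_factory=set)

def reply_features(msg, profile):
    """Map a message to named binary features (1 = present, 0 = absent)."""
    return {
        # Relational features: depend on the user's profile
        "sender_in_address_book": int(msg.sender in profile.address_book),
        "user_in_cc": int(profile.address in msg.cc),
        "user_often_replies_to_sender": int(msg.sender in profile.frequent_reply_targets),
        # Document features: depend only on the message itself
        "has_question_mark": int("?" in msg.body),
        "has_question_word": int(bool(QUESTION_WORDS.search(msg.body))),
        "has_attachment": int(msg.has_attachment),
    }

profile = UserProfile(
    address="me@example.com",
    address_book={"alice@example.com"},
    frequent_reply_targets={"alice@example.com"},
)
msg = Message(sender="alice@example.com", cc=[], body="When can we meet?")
features = reply_features(msg, profile)
print(features)
```

Each dictionary value becomes one dimension of the binary vector fed to the classifier.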

The grand experiment
- Evaluated on 4 user mailboxes
- Users manually tagged messages as either “needs reply” or “does not need reply”
  - “It is not surprising that overwhelmed users acknowledge that a message did require their reply even though they failed to do so; classifiers trained on actual user reply behavior are thus very poor.”
- 2,391 total emails, excluding spam
- 80/20 train/test split
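
An 80/20 split like the one mentioned here can be sketched as below. The slides do not say whether the split was random or chronological, or whether it was made per mailbox, so the seeded random shuffle over one pooled list is purely an assumption for illustration.

```python
import random

def train_test_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of the items and cut it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Message ids 0..2390 stand in for the 2,391 tagged emails.
train, test = train_test_split(range(2391))
print(len(train), len(test))
```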

The single-user results

The cross-user results
- Only relational features were effective, so the others were omitted

Attachment prediction
- “See attachment…hey, wait a minute…”
- Possible UI considerations
  - Document sidebar
  - Alert user before sending
- Indicate which messages need attachments

Attachment prediction features
- Relational features
  - Based on user profile
    - # of sent and received messages, # of attachments, address and domain
  - Conjunctions between volume of messages/attachments and TO/CC fields
  - 72 in Dredze et al.’s experiment
- Document features
  - Presence and placement of “attach”
  - Presence of attachments
  - 39,308 in Dredze et al.’s experiment
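
The "presence and placement of 'attach'" feature might look something like the sketch below. How placement was actually encoded is not stated on the slide; splitting the message body into thirds and the feature names are assumptions chosen for the example.

```python
def attachment_text_features(body):
    """Binary features for whether and roughly where 'attach' first occurs."""
    text = body.lower()
    idx = text.find("attach")  # also matches "attached", "attachment", ...
    features = {
        "mentions_attach": 0,
        "attach_at_start": 0,
        "attach_in_middle": 0,
        "attach_at_end": 0,
    }
    if idx == -1:
        return features
    features["mentions_attach"] = 1
    position = idx / len(text)  # relative offset of the first match, 0 <= position < 1
    if position < 1 / 3:
        features["attach_at_start"] = 1
    elif position < 2 / 3:
        features["attach_in_middle"] = 1
    else:
        features["attach_at_end"] = 1
    return features

print(attachment_text_features("Attached is the draft."))
print(attachment_text_features("Thanks for the call yesterday. Notes are in the attachment"))
```

A sending-time check could compare `mentions_attach` against whether the outgoing message actually carries a file.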

The grander experiment
- Evaluated on publicly available Enron corpus
  - 150 users and 250,000 emails
  - Lots of cleanup needed
- Users manually tagged messages as needs attachment
  - Only popular document formats
  - Forwarded messages excluded
- Subset of 15,000 messages from 144 users
  - 1,020 with attachments
- 10-fold cross validation
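
For readers unfamiliar with 10-fold cross validation, the index bookkeeping can be sketched as follows. The function name, the seeded shuffle, and the way folds are dealt out are illustrative assumptions; the slides do not describe how the folds were formed.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Each message is used for testing exactly once across the 10 folds.
splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

A model would be trained and scored once per fold, and the per-fold scores (for example F1) averaged.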

The results

GUEPs and CDs
- GUEPs
  - Mental model
  - Improvement
  - Consistency
- CDs
  - Premature commitment
  - Hidden dependencies
  - Abstraction
  - Consistency
  - Provisionality