Published by Della Hall. Modified over 9 years ago.
1
Intelligent Email: Reply and Attachment Prediction
Mark Dredze, Tova Brooks, Josh Carroll, Joshua Magarick, John Blitzer, Fernando Pereira
Presented by Nareg Torosian
2
What’s the use?
Whittaker & Sidner’s “email overload”
  Task management
  Personal archiving
  Asynchronous communication
Assist overwhelmed email users
Support enhanced email interface
3
Intelligent? How?
Prediction tasks are treated as binary classification problems
Each message is represented as a binary vector, where each dimension represents a feature
Learning is performed with logistic regression
The system is evaluated using F1, the harmonic mean of precision and recall
Single-user (adaptive) and cross-user (adaptable) settings
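The setup on this slide can be illustrated with a small sketch: binary feature vectors, a logistic regression classifier, and F1 as the evaluation score. This is a minimal illustration and not the authors' implementation. The plain stochastic gradient descent training loop, the learning rate, the epoch count, and all function names are assumptions made for the example.

```python
import math


def f1_score(y_true, y_pred):
    """F1 is the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit weights with plain SGD (an assumed optimizer, chosen for brevity)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - target  # gradient of the log loss w.r.t. the score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Label a binary feature vector as 1 (positive) or 0 (negative)."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


# Toy data: each row is a binary feature vector, each label is 0 or 1.
X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
w, b = train_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
score = f1_score(y, predictions)
```

In the single-user setting the training and test messages would come from the same mailbox. In the cross-user setting the classifier would be trained on some users and scored on another.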
4
Reply prediction
Indicate which messages require a reply
Allow the user to manage these messages
5
Reply prediction features
Relational features
  Based on user profile
  Number of sent and received messages, address book, email address and domain
  “I appear in the CC list”, “I frequently reply to this user”, etc.
  200 features in Dredze et al.’s experiment
Document features
  Presence of question marks and question words
  TF-IDF (term frequency – inverse document frequency) scores
  Presence of attachments
  14,800 features in Dredze et al.’s experiment
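As a rough illustration of the document features listed above, the sketch below extracts question-mark, question-word, and attachment indicators and computes TF-IDF scores over a small set of messages. The question-word list, the tokenizer, and the particular TF-IDF formula (relative term frequency times the log of inverse document frequency) are assumptions made for the example. The slides do not say which variants were used.

```python
import math
import re
from collections import Counter

# Assumed question-word list; the slides do not give the actual one.
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def document_features(text, has_attachment):
    """Binary document features for one message."""
    tokens = set(tokenize(text))
    return {
        "has_question_mark": int("?" in text),
        "has_question_word": int(bool(tokens & QUESTION_WORDS)),
        "has_attachment": int(has_attachment),
    }


def tfidf(docs):
    """Per-document TF-IDF scores: (count / length) * log(N / doc frequency)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        scores.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return scores


features = document_features("When can you send the report?", has_attachment=False)
scores = tfidf(["budget meeting", "budget report"])
```

A term that occurs in every message, such as "budget" here, gets a score of zero, while terms specific to one message score higher.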
6
The grand experiment
Evaluated on 4 user mailboxes
Users manually tagged messages as either needs reply or does not need reply
“It is not surprising that overwhelmed users acknowledge that a message did require their reply even though they failed to do so; classifiers trained on actual user reply behavior are thus very poor.”
2,391 total emails, excluding spam
80/20 train/test split
7
The single-user results
8
The cross-user results
Only relational features were effective, so the others were omitted
9
Attachment prediction
“See attachment…hey, wait a minute…”
Possible UI considerations
  Document sidebar
  Alert user before sending
  Indicate which messages need attachments
10
Attachment prediction features
Relational features
  Based on user profile
  Number of sent and received messages, number of attachments, email address and domain
  Conjunctions between volume of messages/attachments and TO/CC fields
  72 features in Dredze et al.’s experiment
Document features
  Presence and placement of “attach”
  Presence of attachments
  39,308 features in Dredze et al.’s experiment
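One possible reading of “presence and placement of ‘attach’” is sketched below: a binary indicator for whether the string occurs at all, plus indicators for which part of the message body it occurs in. Splitting the body into four equal parts, the feature names, and the function name are assumptions made for illustration. The slides do not describe how placement was encoded.

```python
import re


def attach_features(body, n_bins=4):
    """Binary features for the presence and rough position of "attach".

    The body is divided into n_bins equal-length parts (an assumed encoding);
    a part's feature is set to 1 if "attach" starts inside that part.
    """
    features = {"mentions_attach": 0}
    for i in range(n_bins):
        features[f"attach_in_part_{i}"] = 0

    lowered = body.lower()
    length = max(len(lowered), 1)
    for match in re.finditer(r"attach", lowered):
        features["mentions_attach"] = 1
        bin_index = min(match.start() * n_bins // length, n_bins - 1)
        features[f"attach_in_part_{bin_index}"] = 1
    return features


early = attach_features("Attached is the report you asked for.")
later = attach_features("Please see the attached file.")
none = attach_features("See you tomorrow.")
```

Matching on the stem "attach" also catches "attached" and "attachment", which is presumably why the slide names the stem instead of a full word.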
11
The grander experiment
Evaluated on the publicly available Enron email corpus
150 users and 250,000 emails
Lots of cleanup needed
Users manually tagged messages as needs attachment
Only popular document formats
Forwarded messages excluded
Subset of 15,000 messages from 144 users
1,020 with attachments
10-fold cross-validation
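The 10-fold cross-validation mentioned on this slide can be sketched as follows: the messages are shuffled, divided into ten folds, and each fold serves once as the test set while the other nine are used for training. The shuffling seed and the function name are assumptions, and the slides do not say whether the folds were stratified or grouped by user.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
```

Every message appears in exactly one test fold, so the scores averaged over the folds cover the whole data set.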
12
The results
13
GUEPs and CDs
GUEPs
  Mental model
  Improvement
  Consistency
CDs
  Premature commitment
  Hidden dependencies
  Abstraction
  Consistency
  Provisionality