Towards a Personal Briefing Assistant


1 Towards a Personal Briefing Assistant
Nikesh Garera
LTI Student Research Symposium, 24th Sep 2004
Carnegie Mellon University

2 Creating Briefings/Reports in academia
The report writer creates briefings for several audiences:
- Funding bodies
- Peer groups
- Students

3 How can the Briefing Assistant help?
[Architecture diagram] Weekly participant interviews feed an automatic summarizer, whose learned rules and model parameters produce a reordered list (items ranked according to importance: 1. ___ 2. ___ 3. ___). The report writer selects/deselects items, specifies rules, and synthesizes items to create the summary; a learning module feeds these choices back into the model. A sketch of this loop follows below.
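To make the workflow concrete, here is a minimal sketch of one weekly cycle of this loop. All names here (ItemScorer, collect_feedback) are illustrative assumptions, not the system's actual API.

```python
# Hypothetical sketch of one weekly assist-and-learn cycle; names are
# illustrative stand-ins, not the actual system's API.

class ItemScorer:
    """Stand-in for the learned summarization model."""
    def score(self, item):
        return len(item)            # placeholder importance score

    def update(self, items, labels):
        pass                        # placeholder for incremental retraining

def collect_feedback(ranked):
    """Stand-in for the writer selecting/deselecting items."""
    return ranked[:5]               # e.g. keep the top 5 suggestions as-is

def weekly_cycle(model, weekly_items):
    # 1. Automatic summarizer ranks interview items by predicted importance.
    ranked = sorted(weekly_items, key=model.score, reverse=True)
    # 2. Report writer selects/deselects items and synthesizes the summary.
    selected = collect_feedback(ranked)
    # 3. Learning module updates model parameters from the writer's choices.
    model.update(ranked, [item in selected for item in ranked])
    return selected

summary = weekly_cycle(ItemScorer(), ["met with X and Y", "two weeks behind schedule"])
```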

4 Domain: Weekly Project Interviews
A weekly log of project-related activities, collected through a structured interview (self-administered or conducted by an administrative assistant). Refining of the corpus led to a set of 12 weekly interviews (average of items per week). Item categories, with examples:
- Concrete Achievements, e.g. presented the project plan to visiting project managers on 20th May.
- Problems and Solutions, e.g. we may be neglecting the idea of "audience": all communication is centered on the user request and neglects that the content producer has a point of view.
- People and Places, e.g. met with X and Y, founders of Z, which is in the knowledge management business; it appears that the system and the tools they have built may be of use to us.
- News and Views, e.g. read about the Shaken/KM knowledge base system.
- Artifacts and Processes, e.g. learned about PLONE, a system for collaborative construction of websites.
- Project Status, e.g. we are two weeks behind schedule in implementation of the prototype.
- General Activity Report, e.g. considered attaching a "spy" module that tracks ALL of the user's keystrokes and mouse clicks; the purpose is to see if we can mine these data.

5 Identifying “Research-Acts”
- Domain features (research-acts), describing tasks/actions: Action-Item, Aspiration, Commitment, Evaluation, Event, Hardware, Human-Relations, Idea, Info-gain, Issue, Object-gain, Observation, Other, Part-Product, Product, Process, Publication, Status
- Named entities: Organization, Person, Software
- Meta features: Author
(22 features in all)

6 Design of Evaluation Experiment
- Seven "expert" subjects participated in the study.
- Subjects create 5-item summaries for 12 successive weeks.
- The summarization model is incrementally trained from week to week, using a logistic regression classifier (see the sketch below).
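A minimal sketch of this week-to-week training regime, assuming items are already encoded as feature vectors. The synthetic data and the retrain-on-all-weeks-so-far scheme are assumptions; the study's exact update procedure may differ.

```python
# Sketch of week-to-week incremental training with logistic regression.
# Synthetic data and the cumulative-retraining scheme are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
weeks = [(rng.random((20, 22)),            # 20 items/week, 22 features each
          rng.integers(0, 2, 20))          # 1 = item picked for the summary
         for _ in range(12)]

model = LogisticRegression(max_iter=1000)
seen_X, seen_y = [], []
for week_no, (X, y) in enumerate(weeks, start=1):
    if week_no > 1:
        # Rank this week's items using the model trained on earlier weeks;
        # this ordering would drive the reordered list shown to the writer.
        ranking = np.argsort(-model.predict_proba(X)[:, 1])
    seen_X.append(X)
    seen_y.append(y)
    model.fit(np.vstack(seen_X), np.concatenate(seen_y))  # retrain on all weeks so far
```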

7 Evaluation Metrics
We need to evaluate a ranked list given a binary classification for each item in the list.
- Average Precision
- Top 30%: the percentage of the summary covered in the top 30% of the ranked list
- Baseline: random order (Avg Prec: 0.265, Top 30%: 0.259)
- Conventional heuristic-based summarization methods do not give much leverage (centroid-based summarization: Avg Prec: 0.25, Top 30%: 0.3)
- Elapsed Time: time taken to create a summary from the ranked list
A sketch of the two ranking metrics follows below.
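Both ranking metrics are simple to compute from a ranked list with a binary label per item. Below is a hedged sketch of one way to implement them; the study's exact definitions may differ in detail.

```python
# Sketch of the two ranked-list metrics described on this slide.

def average_precision(ranked_labels):
    """ranked_labels[i] is 1 if the item at rank i+1 is a true summary item."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank        # precision at each relevant rank
    return total / hits if hits else 0.0

def top_30_percent(ranked_labels):
    """Fraction of the true summary items found in the top 30% of the list."""
    cutoff = max(1, round(0.3 * len(ranked_labels)))
    relevant = sum(ranked_labels)
    return sum(ranked_labels[:cutoff]) / relevant if relevant else 0.0

print(average_precision([1, 0, 1, 0, 0]))                 # 0.833...
print(top_30_percent([1, 0, 0, 1, 0, 0, 0, 0, 0, 0]))     # 0.5
```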

8 Evaluation Results

Metric     Baseline (random)   6th biweek   Improvement (factor)
Avg Prec   0.265               0.586        2.21
Top 30%    0.259               0.729        2.81

Elapsed time: subjects also show a significant 31% decrease in task time over the 6 biweeks, but this can be attributed to other factors.

9 Different people produce different summaries
Kappa statistic (inter-rater agreement):
K = (P(A) - P(E)) / (1 - P(E))
where P(A) = proportion of times the n raters agree, and P(E) = proportion of times we would expect the n raters to agree by chance.
The summary items selected show low inter-rater agreement (K = 0.26, n = 7). Different strokes for different folks!
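As a worked example, here is a direct transcription of the formula. The P(A)/P(E) values below are one hypothetical combination that reproduces the K = 0.26 reported, not figures from the study.

```python
# Direct transcription of the kappa formula above. The P(A)/P(E) values
# are hypothetical numbers chosen to reproduce K = 0.26.

def kappa(p_agree, p_chance):
    """K = (P(A) - P(E)) / (1 - P(E))"""
    return (p_agree - p_chance) / (1 - p_chance)

print(kappa(0.63, 0.50))   # 0.26: low agreement on which items to select
```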

10 Differences in feature weights
Top 12 features for two different subjects, ordered by absolute weight in the model:

Subject A                      Subject D
OBJECT-GAIN      -28.96        PUBLICATION      -7.54
INFO-GAIN        -10.44        OBJECT-GAIN      -5.67
ASPIRATION        -9.68        OTHER            -5.15
HUMAN-RELATIONS   -9.22        HUMAN-RELATIONS   4.85
HARDWARE           7.19        HARDWARE          4.11
ACTION-ITEM        6.59        COMMITMENT        3.9
OTHER             -5.1         EVALUATION        3.78
PRODUCT            4.24        PROCESS          -3.61
PART-PRODUCT       3.79        IDEA              3.5
EVALUATION         3.77        PRODUCT           2.56
IDEA               2.65        ASPIRATION       -2.53
author=aaaaaa     -2.48        author=bbbbbb    -1.96

11 Is this approach domain independent?
The assistance framework is fairly general, but the features are manually annotated. We have investigated two techniques for automating them:
- Use information extraction approaches (such as in named-entity extraction), based on the 22 features previously described.
- Use bag of words as features.

12 Results: Using automatic feature extraction
- Smoothing of the corpus: stemming, removing stop-words, and tokenization of known domain terms.
- A simple rote classifier was trained using cross-validation to automatically annotate features (a sketch follows below).
- A logistic regression classifier was used for learning subject models.

Metric     Baseline (random)   6th biweek   Improvement (factor)
Avg Prec   0.265               0.473        1.785
Top 30%    0.259               0.614        2.36
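The slide does not spell out the rote classifier. "Rote" classically means memorizing training instances verbatim; here is a minimal sketch under that assumption, not the study's actual implementation.

```python
# Minimal sketch of a rote classifier: it memorizes token/label pairs from
# training data and predicts the majority label for tokens it has seen.
from collections import Counter, defaultdict

class RoteClassifier:
    def __init__(self, default="OTHER"):
        self.memory = defaultdict(Counter)
        self.default = default

    def fit(self, tokens, labels):
        for tok, lab in zip(tokens, labels):
            self.memory[tok][lab] += 1       # memorize every pair seen

    def predict(self, tok):
        # Majority label for a memorized token, else a fallback default.
        counts = self.memory.get(tok)
        return counts.most_common(1)[0][0] if counts else self.default

clf = RoteClassifier()
clf.fit(["PLONE", "schedule"], ["PROCESS", "STATUS"])
print(clf.predict("PLONE"))     # PROCESS
print(clf.predict("unseen"))    # OTHER
```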

13 Results: using bag of words as features
- Features: values were tf/idf weights of the respective words (L1-norm).
- A voted perceptron classifier was used for learning subject models, due to the high dimensionality of the feature space (a sketch follows below).

Metric     Baseline (random)   6th biweek   Improvement (factor)
Avg Prec   0.265               0.413        1.56
Top 30%    0.259               0.4          1.54
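A hedged sketch of this feature setup: tf/idf weights with L1 normalization over a toy corpus. scikit-learn has no voted perceptron, so a plain Perceptron stands in below; the study used the voted variant. The items and labels are toy examples.

```python
# Sketch of the bag-of-words setup: tf/idf features with L1 normalization.
# A plain Perceptron stands in for the voted perceptron used in the study.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Perceptron

items = ["met with X and Y, founders of Z",
         "two weeks behind schedule on the prototype",
         "read about the Shaken/KM knowledge base system",
         "presented project plan to visiting managers"]
picked = [0, 1, 0, 1]                 # toy labels: 1 = writer kept the item

X = TfidfVectorizer(norm="l1").fit_transform(items)   # sparse, high-dimensional
clf = Perceptron().fit(X, picked)
scores = clf.decision_function(X)     # scores used to rank items for the list
```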

14 Summary
- Looked at the Briefing Assistant application and identified several research problems.
- Proposed a learning-based framework for assisting the report writer in creating "tailored" reports.
- The evaluation study indicates potential for an increase in user productivity.
- Obtained encouraging initial results for generalizing this approach to different domains.

15 Interesting problems…
- What if the dynamics of the project change?
- How can a report writer instruct the system?
- How can the system proactively acquire relevant information?

16 Thank you

