UROP Research Update Citation Function Classification Eric Yulianto A0069442B 22 February 2013.


1 UROP Research Update Citation Function Classification Eric Yulianto A0069442B 22 February 2013

2 Outline
- Motivation
- Problem
- Related Work
- Current Progress
- Follow Up

3 Motivation
- Assist researchers during the paper review process.
- Categorize citations quickly, with a minimal amount of reading.
- Help prioritize the more important papers.

4 Problem
- Given a citation in a paper, what is the purpose of the citation?
- The reader needs to repeatedly read a section of the paper.
- The intention may not be obvious from the citation sentence.

5 Example
Excerpt from (Busemann, Schmeier, & Arens, 2000):
- SVMs are described in (Vapnik, 1995). SVMs are binary learners in that they distinguish positive and negative examples for each class. (neutral context)
- In all experiments the SVM Light system outperformed other learning algorithms, which confirms Yang’s (Yang and Liu, 1999) results for SVM's fed with Reuters data. (positive context)

6 Related Work
Teufel et al., 2006
- Features used:
  - Cue phrases
  - Verb clusters
  - Verb tense
  - Modality
  - Self-citation indicator
- IBk (k-nearest neighbour) algorithm
- Accuracy: 77%

7 Related Work
Angrosh et al., 2010
- Citation classification => sentence classification.
- Related Work section only.
- Features used:
  - Word category
  - Presence of a citation in the previous sentence
- Conditional random field.
- Generally performs well: accuracy 96.51%.
- Did not perform well on citation sentences.

8 Related Work
Dong and Schafer, 2011
- Features used:
  - Cue words
  - Physical: Location, Popularity, Density, AvgDens
  - Sentence syntax
- Ensemble-style self-training algorithm.

9 Current Progress (Analysis)
Citation scheme
- Adopt and modify the scheme from Teufel et al., 2006.
- 12 classes => 4 classes:
  - Weakness
  - CompareContrast
  - Positive
  - Neutral

10 Current Progress (Analysis)
Dataset
- ANLP conference papers from the ACL Anthology.
- Citation contexts extracted from ParsCit output.
- Distribution (609 citations):
  - Weakness: 30
  - CompareContrast: 72
  - Positive: 236
  - Neutral: 271
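The class distribution above is skewed, so a useful reference point for any accuracy figure is a classifier that always predicts the most frequent class. The short sketch below derives the class shares and that majority-class baseline (about 44.5%) from the counts reported on the slide; the baseline itself is not reported in the presentation and is added here only as context.

```python
# Class counts as reported on the dataset slide.
counts = {"Weakness": 30, "CompareContrast": 72, "Positive": 236, "Neutral": 271}

total = sum(counts.values())  # should match the 609 citations reported

# Share of each class in the dataset.
shares = {label: n / total for label, n in counts.items()}

# Accuracy of a trivial classifier that always predicts the largest class.
majority_label = max(counts, key=counts.get)
majority_baseline = counts[majority_label] / total

for label, share in shares.items():
    print(f"{label:16s}{share:7.2%}")
print(f"Majority baseline ({majority_label}): {majority_baseline:.2%}")
```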

11 Current Progress (Analysis)
Classification Algorithm
- Weka implementations of Naive Bayes and SVM
- Chi-square attribute selection filter
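The idea behind chi-square attribute selection can be sketched in a few lines: score each feature by how strongly its values depend on the class label, then keep the highest-scoring features. The code below is a minimal pure-Python illustration, not the Weka implementation used in the study; the toy labels and the two feature columns are invented for the example.

```python
from collections import Counter


def chi_square(feature_values, labels):
    """Chi-square statistic of a discrete feature against the class label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    score = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            # Expected count if the feature and the label were independent.
            expected = f_count * c_count / n
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


# Toy data (hypothetical): 4 Positive and 4 Neutral citation contexts.
labels = ["Positive"] * 4 + ["Neutral"] * 4
features = {
    "outperformed": [1, 1, 1, 0, 0, 0, 0, 0],  # mostly in Positive contexts
    "described":    [1, 0, 1, 0, 1, 0, 1, 0],  # evenly spread over classes
}

scores = {name: chi_square(values, labels) for name, values in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

A feature that is spread evenly over the classes scores zero and would be the first to be filtered out, while a feature concentrated in one class scores higher.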

12 Current Progress (Analysis)
Features Used and Tested:
- Cue words
- Cue words + chi-square filter
- Word categories (Angrosh et al., 2010)
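As an illustration of what a cue-word feature might look like, the sketch below maps a citation context to a binary presence vector over a small word list. Both the list `CUE_WORDS` and the binary encoding are assumptions made for this example; the slides do not state which cue words or which encoding the study used. The two contexts are taken (the second one abridged) from the example excerpt on slide 5.

```python
import re

# Hypothetical cue-word list for illustration; not the list used in the study.
CUE_WORDS = ["outperformed", "confirms", "described", "however", "unlike"]


def cue_word_features(context):
    """Map a citation context to a binary presence vector over CUE_WORDS."""
    tokens = set(re.findall(r"[a-z']+", context.lower()))
    return [1 if cue in tokens else 0 for cue in CUE_WORDS]


neutral = "SVMs are described in (Vapnik, 1995)."
positive = ("In all experiments the SVM Light system outperformed other learning "
            "algorithms, which confirms Yang's (Yang and Liu, 1999) results.")

print(cue_word_features(neutral))
print(cue_word_features(positive))
```

Vectors of this kind could then be passed to a classifier such as Naive Bayes or an SVM, optionally after a feature selection step.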

13 Current Progress (Analysis)
Features Used              Naive Bayes   SVM
Cue Words                  64.37%        67.16%
Cue Words + filter         66.48%        68.95%
Angrosh reimplementation   51.24%        49.90%

14 Ongoing Process
Features extracted but not yet tested:
- Physical features (Dong and Schafer, 2011):
  - Location
  - Density
  - Popularity
- Author and title information
- Publication year

15 Follow Up
- Add more features that can help differentiate the citation functions.
- Use a larger dataset.
- Split the classification into two stages:
  – Use the metadata (physical features, author information, title information, publication year).
  – Use the cue words to refine the classification.

16 Thank You

