Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 2007.SIGIR.8 New Event Detection Based on Indexing-tree.

1 Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
2007.SIGIR.8
New Event Detection Based on Indexing-tree and Named Entity
Advisor: Dr. Hsu
Presenter: Hsin-Yi Huang
Authors: Zhang Kuo, Li Juan Zi, Wu Gang

2 N.Y.U.S.T. I. M. Intelligent Database Systems Lab 2007/8/15
Outline
- Introduction
- Motivation
- Objective
- Basic New Event Detection (NED) Model
- News Indexing-tree
- Term reweighting approach
- Experiment
- Conclusion
- Comments

3 Introduction
Traditional New Event Detection (NED) System
NED model
Input: news story stream
Output:
1. The decision
2. The confidence of the decision
Components:
1. Story Representation
   (1) Preprocessing
   (2) Term weight calculation
2. Similarity Calculation
3. Detection Procedure
   (1) S-S type (the new story is compared with each previous story)
   (2) S-C type (the new story is compared with each previous cluster)
Example term weights per story:
     marriage  storm  explode  film  diet
A    0         0.8    0.5      0     0
B    0.9       0.1    0        0     0
C    0         0      1.0      0.3   0
D    0.4       0.2    0        0     0.1
E    0         0      0.5      0.7   0
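To make the story representation concrete, the sketch below stores the example term weights from this slide as sparse vectors and compares stories pairwise. Cosine similarity is used only as a familiar stand-in and is an assumption, as are the helper names `cosine` and `most_similar`; the slide itself does not say which measure applies to this table.

```python
import math

# Term weights copied from the example table; zero entries are omitted.
stories = {
    "A": {"storm": 0.8, "explode": 0.5},
    "B": {"marriage": 0.9, "storm": 0.1},
    "C": {"explode": 1.0, "film": 0.3},
    "D": {"marriage": 0.4, "storm": 0.2, "diet": 0.1},
    "E": {"explode": 0.5, "film": 0.7},
}


def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (assumed measure)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(name):
    """Return (similarity, other_story) for the story closest to `name`."""
    return max(
        (cosine(stories[name], vec), other)
        for other, vec in stories.items()
        if other != name
    )
```

With these weights, `most_similar("B")` picks story D, since both put most of their weight on "marriage", while B and C share no terms and score 0.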

4 Motivation
- How can the detection procedure be sped up without decreasing detection accuracy?
- How can cluster (topic) information be used to improve accuracy?
- How can a better understanding of named entities improve the news story representation?

5 Objective
Efficiency
- News Indexing-tree
Accuracy
- Use of cluster (topic) information
- Use of named entities based on news classification

6 Basic NED Model
TF-IDF (term frequency–inverse document frequency)
Incremental TF-IDF (timeline figure: 1, 2, …, t-1, t)
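The formulas on this slide did not survive in the transcript, so the following is only a minimal sketch of the incremental idea: document frequencies are updated each time a story arrives at time t, and the new story is weighted with the statistics available at that moment. The class name `IncrementalTfidf`, the log-scaled term frequency, the smoothed IDF, and the L2 normalisation are all assumptions chosen for illustration, not the authors' exact weighting.

```python
import math
from collections import Counter


class IncrementalTfidf:
    """Keeps document frequencies up to date as stories arrive (illustrative)."""

    def __init__(self):
        self.num_stories = 0       # stories seen so far
        self.doc_freq = Counter()  # term -> number of stories containing it

    def add_story(self, tokens):
        """Update the statistics with a new story, then return its term weights."""
        self.num_stories += 1
        self.doc_freq.update(set(tokens))
        term_freq = Counter(tokens)
        # Assumed weighting: log-scaled tf times a smoothed idf.
        weights = {
            term: math.log(1 + count)
            * math.log((self.num_stories + 1) / (self.doc_freq[term] + 0.5))
            for term, count in term_freq.items()
        }
        # Assumed L2 normalisation so stories of different length are comparable.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm == 0:
            return weights
        return {term: w / norm for term, w in weights.items()}
```

Because the statistics only ever look backwards, a term that has already appeared in earlier stories receives a lower weight in a later story than a term seen for the first time.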

7 Basic NED Model (cont.)
Similarity Calculation
Detection Procedure (flowchart labels: a new story, an old story, No, Yes)
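The flowchart on this slide is not recoverable, so the sketch below only illustrates an S-S type detection loop under stated assumptions: each new story is compared with every old story, and it is flagged as a new event when its best match is weak. The similarity is the Bhattacharyya coefficient, from which Hellinger distance (named on the Experiment slide) is derived; the function names, the novelty score, and the threshold value are hypothetical.

```python
import math


def hellinger_similarity(u, v):
    """Bhattacharyya coefficient of two non-negative term-weight vectors,
    each normalised to sum to 1. Hellinger distance is sqrt(1 - this value)."""
    sum_u, sum_v = sum(u.values()), sum(v.values())
    if sum_u == 0 or sum_v == 0:
        return 0.0
    return sum(
        math.sqrt((w / sum_u) * (v[t] / sum_v)) for t, w in u.items() if t in v
    )


def detect_new_events(stream, threshold):
    """S-S type detection: return (story_id, is_new, novelty) per story.

    novelty = 1 - (highest similarity to any earlier story); it doubles as
    the confidence of the decision. `threshold` is an assumed tuning knob.
    """
    seen = []
    results = []
    for story_id, vector in stream:
        best = max((hellinger_similarity(vector, old) for old in seen), default=0.0)
        novelty = 1.0 - best
        results.append((story_id, novelty > threshold, novelty))
        seen.append(vector)
    return results


# Toy stream (made-up weights): s2 repeats s1, s3 is about something else.
toy_stream = [
    ("s1", {"storm": 0.8, "explode": 0.5}),
    ("s2", {"storm": 0.8, "explode": 0.5}),
    ("s3", {"marriage": 1.0}),
]
decisions = detect_new_events(toy_stream, threshold=0.5)
```

Every new story is scored against all earlier ones, so the cost of this loop grows with the length of the stream; that comparison cost is what the Motivation slide asks to reduce.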

8 News Indexing-tree

9 Term reweighting approach
Based on Distribution Distance

10 Term reweighting approach (cont.)
Based on Term Type and Story Class

11 Term reweighting approach (cont.)

12 Experiment
Datasets
- TDT2 (news stories from January to June 1998)
- TDT3 (English stories from Oct. to Dec. 1998)
Evaluation Metric
Systems compared (table columns: term weight calculation, similarity calculation, detection procedure)
- System-1: incremental TF-IDF, Hellinger distance, S-S type
- System-2: S-C type
- System-3: Indexing-tree
- System-4: term distributions
- System-5: Term Type and Story Class
- System-6, System-7, System-8: the other NED systems

13 Experiment (cont.)

14 Conclusion
- The news indexing-tree reduces comparison time without hurting NED accuracy.
- The two extensions contribute to improved accuracy.
Future work
- Collect a news set that spans a longer period from the internet, and integrate time information into the NED task.
- Refine cluster granularity to the event level, and identify different events and their relations within a topic.

15 Comments
Advantage
- More efficient
- More accurate
Drawback
- Ambiguous signs
- Too many parameters
Application
- …

