
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology 1 A Study of Learning a Merge Model for Multilingual Information.


1 Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
A Study of Learning a Merge Model for Multilingual Information Retrieval
Presenter: Cheng-Hui Chen
Authors: Ming-Feng Tsai, Yu-Ting Wang, Hsin-Hsi Chen
SIGIR 2008

2 Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

3 Motivation
• In multilingual information retrieval (MLIR), the merged result list usually includes more irrelevant results.
• Traditional merging methods for MLIR assume that relevant documents are homogeneously distributed over the monolingual result lists.

4 Objectives
• Merge the monolingual result lists into a single result list while accounting for the varying translation and retrieval quality across collections.
• Propose a merge method that does not assume relevant documents are homogeneously distributed over the monolingual result lists.
• Improve the quality of the merge model.

5 Methodology
• Traditional MLIR framework
  ─ Raw-score
  ─ Round-robin
  ─ Normalized-by-top-1
  ─ Normalized-by-top-k
• Proposed learning method
  ─ FRank
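
The traditional merging strategies listed on this slide can be sketched as follows. This is a minimal illustration, not the presenters' code: the `RESULTS` data, the function names, and the choice to normalize by the mean of the top-k scores are assumptions made for the example.

```python
from itertools import zip_longest

# Hypothetical monolingual result lists: (doc_id, retrieval_score), best first.
RESULTS = {
    "en": [("en1", 12.0), ("en2", 9.0), ("en3", 3.0)],
    "zh": [("zh1", 4.0), ("zh2", 2.0)],
}

def raw_score(results):
    """Merge by sorting all documents on their original, unnormalized scores."""
    merged = [pair for lst in results.values() for pair in lst]
    return [doc for doc, _ in sorted(merged, key=lambda p: p[1], reverse=True)]

def round_robin(results):
    """Interleave the lists, taking one document from each list in turn."""
    merged = []
    for tier in zip_longest(*results.values()):
        merged.extend(pair[0] for pair in tier if pair is not None)
    return merged

def normalized_by_top_k(results, k=1):
    """Divide each score by the mean of its own list's top-k scores, then sort.

    Assumption: normalization uses the average of the top-k scores; with k=1
    this reduces to normalized-by-top-1.
    """
    merged = []
    for lst in results.values():
        if not lst:
            continue  # nothing to merge from an empty list
        top = [score for _, score in lst[:k]]
        norm = sum(top) / len(top)
        merged.extend((doc, score / norm) for doc, score in lst)
    return [doc for doc, _ in sorted(merged, key=lambda p: p[1], reverse=True)]

print(raw_score(RESULTS))
print(round_robin(RESULTS))
print(normalized_by_top_k(RESULTS, k=2))
```

With this toy data, raw-score merging pushes both high-scoring English documents ahead of every Chinese one, which illustrates why scores from different collections are not directly comparable.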

6 MLIR merge process
• Feature set
  1. Query levels
  2. Document levels
  3. Translation levels
• Construction of a merge model
  1. FRank ranking algorithm
  2. BM25

7 Feature set
• Query levels
  ─ The terms within a query are manually classified into several pre-defined categories:
    Location/country names (Loc)
    Organization names (Org)
    Event names (EN)
    Technical terms (TT)
• Document levels
  ─ The document length (Dlength) and title length (Tlength) are extracted.

8 Feature set
• Translation levels
  ─ The size of the bilingual dictionary used for each language (i.e., DictSize).
  ─ The average number of translation equivalents within a query (i.e., AvgTAD), computed as the number of translations divided by the number of query terms.
    If a query has two query terms, both with three translation equivalents, the AvgTAD of the query is (3 + 3)/2 = 3.
• Example (English → Chinese): for the query terms "Order" and "Park", Order has 4 translation equivalents (訂單, 順序, 命令, 隊形) and Park has 2 (公園, 停車), so AvgTAD = (4 + 2)/2 = 3.
[Figure: example query terms labeled with query-level categories (EN, Loc), e.g. 斗六 and 食べる]
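
The AvgTAD arithmetic on this slide can be checked with a short sketch. The function name `avg_tad` and the dictionary layout are assumptions for illustration; the translation lists are the ones shown in the slide's Order/Park example.

```python
def avg_tad(translations):
    """Average number of translation equivalents per query term.

    `translations` maps each query term to its list of dictionary translations.
    """
    if not translations:
        return 0.0
    total = sum(len(equivalents) for equivalents in translations.values())
    return total / len(translations)

# English -> Chinese example from the slide: 4 and 2 translation equivalents.
query = {
    "Order": ["訂單", "順序", "命令", "隊形"],
    "Park": ["公園", "停車"],
}
print(avg_tad(query))  # (4 + 2) / 2 = 3.0
```

A query whose two terms each have three equivalents gives the same value, (3 + 3)/2 = 3, so AvgTAD alone does not distinguish how the ambiguity is spread across the terms.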

9 The construction of the merge model
• Following FRank's generalized additive model, a merge model can be represented as a weighted sum of weak learners, where:
  ─ m_t(x) is a weak learner
  ─ α_t is the learned weight
  ─ t indexes the selected weak learners
• The merge model is combined with a retrieval model (BM25) by linear combination.
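
A generalized additive model of this kind, together with its linear combination with BM25, might look like the sketch below. Everything concrete here is an assumption for illustration: the threshold-style weak learners, the weights, the feature values, and the exact combination form λ·merge + (1 − λ)·BM25 are not taken from the slides.

```python
def merge_score(x, learners, weights):
    """Additive model: weighted sum of weak-learner outputs for features x."""
    return sum(alpha * m(x) for alpha, m in zip(weights, learners))

def combined_score(x, bm25, learners, weights, lam=0.5):
    """Linear combination of the merge-model score and a BM25 score.

    Assumed form: lam * merge + (1 - lam) * bm25, with lam in [0, 1].
    """
    return lam * merge_score(x, learners, weights) + (1.0 - lam) * bm25

# Hypothetical weak learners: binary thresholds on single features.
learners = [
    lambda x: 1.0 if x["AvgTAD"] <= 3.0 else 0.0,
    lambda x: 1.0 if x["Dlength"] > 200 else 0.0,
]
weights = [0.8, 0.4]  # hypothetical learned weights

doc = {"AvgTAD": 3.0, "Dlength": 120}
print(merge_score(doc, learners, weights))
print(combined_score(doc, bm25=2.0, learners=learners, weights=weights, lam=0.5))
```

Setting `lam` to 0 falls back to BM25 alone and setting it to 1 uses only the learned merge model, so the coefficient controls how much the learned features are allowed to reorder the retrieval scores.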

10 Experiments
• Data set
  ─ Details of the experimental collections
  ─ Percentage of retrieved documents

11 Experiments
• Mean Average Precision (MAP)
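
For reference, MAP over a set of queries can be computed as in the sketch below. The document IDs and relevance judgments are made up for the example and do not come from the experiments in the talk.

```python
def average_precision(ranked, relevant):
    """Average of precision@rank taken at each rank where a relevant doc appears."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Two hypothetical queries with their merged rankings and relevant documents.
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6"], {"d6"}),                    # AP = 1/2
]
print(mean_average_precision(runs))
```

Because average precision rewards relevant documents that appear early, a merging method that buries one collection's relevant documents lowers MAP even when every relevant document is eventually retrieved.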

12 Experiments
• Experimental results of our method using different values of the combination coefficient λ

13 Experiments
• Feature analysis

14 Conclusions
• The proposed merge model can significantly improve merging quality.
• The merge model indicates that the key factors are the number of translatable terms and compound words.

15 Conclusions
• Future work
  ─ Use other learning-based ranking algorithms, such as RankSVM and RankNet.
  ─ Extract more representative features, such as linguistic features, to construct a merge model.
  ─ Discover more relations among query terms, such as query term association and substitution.

16 Comments
• Advantage
  ─ Improves merging quality.
• Drawback
• Application
  ─ Multilingual information retrieval.

