Proximity-based Ranking of Biomedical Texts Rey-Long Liu * and Yi-Chih Huang * Dept. of Medical Informatics Tzu Chi University Taiwan.


1 Proximity-based Ranking of Biomedical Texts
Rey-Long Liu* and Yi-Chih Huang*
Dept. of Medical Informatics, Tzu Chi University, Taiwan

2 Outline
Research background
Problem definition
The proposed approach: PRE
Empirical evaluation
Conclusion
A Proximity-based Ranker Enhancer

3 Research Background

4 Biomedical Information Need
Biomedical research requires relevant evidence from the huge and ever-growing biomedical literature
Retrieving that evidence requires a system that
–Accepts a natural language query for a biomedical information need, and
–Ranks relevant texts higher for access or processing

5 An Example Info Need
Query: urinary tract infection, criteria for treatment and admission (from OHSUMED)
–A disease as the target concept (i.e., urinary tract infection)
–Two concepts about the scenario of the information need (i.e., treatment and admission)
Neither scenario concept is specific or related to any particular disease

6 Problem Definition

7 Goals
Explore how text rankers may be improved by considering the completeness of query concepts appearing in a nearby area of the text being ranked
Develop a technique, PRE (Proximity-based Ranker Enhancer), that
–Measures contextual completeness of query concepts appearing in a nearby area in the text
–Serves as a supplement to improve existing rankers

8 Related Work
Biomedical text ranking
–Uses synonyms and considers diversity of passages, without considering term proximity
Text ranking
–Individual text scoring techniques (e.g., BM25) and learning-to-rank techniques (e.g., Ranking SVM), without considering term proximity
Improving ranking by term proximity
–Term proximity is employed, but contextual completeness is not considered

9 The Proposed Approach: PRE

10 System Overview
[Diagram labels: Training; Testing; Text Ranker Development; Text Ranking; Underlying Ranker; PRE; TF (Term Frequency) Assessment; TF in d; User Query (q); Text (d); Training Data; Ranked Texts]
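One possible reading of the components named in this overview is sketched below in Python: the TF of each term in a text d is reassessed by an enhancer before an underlying ranker scores d against the query q. The data flow is an assumption inferred from the diagram labels, the training side is omitted, and all names (`rank_texts`, `identity_enhancer`, `tf_sum_ranker`) are hypothetical stand-ins, not the authors' components.

```python
from collections import Counter


def identity_enhancer(tf, tokens, query_terms):
    """Placeholder enhancer (hypothetical): returns TF unchanged.
    A PRE-style enhancer would return reassessed TF values here."""
    return dict(tf)


def tf_sum_ranker(tf, query_terms):
    """Toy underlying ranker (assumption): sum of TF over distinct query terms."""
    return sum(tf.get(term, 0.0) for term in set(query_terms))


def rank_texts(query_terms, texts, enhancer=identity_enhancer, ranker=tf_sum_ranker):
    """Score every text with the (possibly reassessed) TF and sort best first.

    texts maps a text id to its list of tokens.
    Ties are broken by text id so the output is deterministic.
    """
    scored = []
    for text_id, tokens in texts.items():
        tf = Counter(tokens)                         # TF in d
        new_tf = enhancer(tf, tokens, query_terms)   # TF assessment step
        scored.append((ranker(new_tf, query_terms), text_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [text_id for _, text_id in scored]
```

Because the enhancer only changes the TF values handed to the ranker, the same wrapper can sit in front of different underlying rankers, which matches the stated goal of serving as a supplement to existing rankers.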

11 TF Assessment
Three types of term proximity
–Overall proximity (QTermTF)
–Individual proximity (IndiP)
–Collective proximity (CollP)
A term t may get a large TF increment in d, if
–Many query terms appear frequently in d
–Query terms are individually near to t at some places, and
–Query terms collectively appear at a place near to t

12
$$\mathrm{RTF}(t,d,q) = \mathrm{TF}(t,d) + \mathrm{TFincrement}(t,d,q)$$
$$\mathrm{TFincrement}(t,d,q) = \mathrm{QTermTF}(d,q) \times \mathrm{IndiP}(t,d,q) \times \mathrm{CollP}(t,d,q)$$
QTermTF(d,q) = total TF of query terms in d
$$\mathrm{IndiP}(t,d,q) = \frac{\sum_{m \in M - \{t\}} \mathrm{SigmoidWeight}(\mathrm{Mindist}(t,m))}{\mathrm{MaxIndiP}}$$
Mindist(x,y) = shortest distance between x and y in d
$$\mathrm{SigmoidWeight}(dt) = \frac{1}{1 + e^{-((|q|-1) - dt)}}$$
$$\mathrm{CollP}(t,d,q) = \frac{\max_{k \in K} \sum_{m \in M - \{t\}} \mathrm{SigmoidWeight}(\mathrm{dist}(t,k,m))}{\mathrm{MaxCollP}}$$
where K is the set of positions at which t appears in d, and dist(t,k,m) = distance between t (at position k) and m
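The formulas on this slide can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation, and it rests on several assumptions the slide does not settle: the three factors of TFincrement are multiplied; M is taken to be the set of query terms that occur in d; |q| is the number of distinct query terms; distance is the absolute difference of token positions; dist(t,k,m) uses the occurrence of m nearest to position k; and the normalizers MaxIndiP and MaxCollP are left as caller-supplied parameters (defaulting to 1.0, i.e. no normalization). The function names are hypothetical.

```python
import math
from collections import defaultdict


def sigmoid_weight(dist, query_len):
    """SigmoidWeight(dt) = 1 / (1 + e^{-((|q| - 1) - dt)})."""
    x = (query_len - 1) - dist
    if x < -700:          # guard: math.exp would overflow for very large distances
        return 0.0
    return 1.0 / (1.0 + math.exp(-x))


def term_positions(tokens):
    """Map each term to the list of token positions at which it occurs."""
    pos = defaultdict(list)
    for i, tok in enumerate(tokens):
        pos[tok].append(i)
    return pos


def rtf(t, tokens, query_terms, max_indip=1.0, max_collp=1.0):
    """RTF(t,d,q) = TF(t,d) + QTermTF * IndiP * CollP (product is an assumption)."""
    pos = term_positions(tokens)
    K = pos.get(t, [])                    # positions of t in d
    tf = len(K)
    if tf == 0:
        return 0.0                        # assumption: absent terms get no increment
    q = set(query_terms)
    n = len(q)                            # |q|, assumed to count distinct terms
    M = {m for m in q if m in pos}        # assumption: query terms present in d
    others = M - {t}
    qterm_tf = sum(len(pos[m]) for m in M)

    def nearest(k, m):
        # Distance from position k to the closest occurrence of m (assumption).
        return min(abs(k - j) for j in pos[m])

    # Individual proximity: each query term at its own closest approach to t.
    indip = sum(
        sigmoid_weight(min(nearest(k, m) for k in K), n) for m in others
    ) / max_indip
    # Collective proximity: the single occurrence of t with the best total.
    collp = max(
        sum(sigmoid_weight(nearest(k, m), n) for m in others) for k in K
    ) / max_collp
    return tf + qterm_tf * indip * collp
```

With a three-term query, `sigmoid_weight` equals 0.5 at distance 2 and falls toward 0 beyond it, so query terms must sit within roughly |q| - 1 tokens of t to contribute much.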

13 Empirical Evaluation

14 Experimental Data
OHSUMED
–A popular database of biomedical queries and references
–106 queries
–348,566 references
–16,140 query-reference pairs, each judged as definitely relevant, possibly relevant, or not relevant

15 Underlying Rankers

16 Baseline Ranker Enhancers
Three state-of-the-art techniques that enhance text rankers by term proximity
–The t-function t() by [Tao & Zhai, 2007]
–The p-function p() by [Cummins & O’Riordan, 2009]
–The proximity language model PLM by [Zhao & Yun, 2009]

17 Evaluation Criteria
Evaluating whether relevant references are ranked higher for users to access
–Mean average precision (MAP)
–Normalized discounted cumulative gain at X (NDCG@X)
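The two criteria can be illustrated with a short Python sketch. It assumes that every relevant reference for a query appears somewhere in the ranked list (so average precision divides by the number of relevant items found), and it uses one common DCG formulation with gain 2^g - 1 and a log2 rank discount; the slides do not say which NDCG variant or which graded gains were used, so those choices are assumptions.

```python
import math


def average_precision(ranked_rels):
    """ranked_rels: booleans in ranked order (True = relevant)."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(ranked_rels, start=1):
        if rel:
            hits += 1
            total += hits / rank      # precision at this relevant item
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """rankings: one list of booleans per query."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


def dcg_at(gains, x):
    """Discounted cumulative gain over the top x graded gains."""
    return sum(
        (2 ** g - 1) / math.log2(i + 1)
        for i, g in enumerate(gains[:x], start=1)
    )


def ndcg_at(gains, x):
    """NDCG@X: DCG of the ranking divided by DCG of the ideal ordering."""
    ideal = dcg_at(sorted(gains, reverse=True), x)
    return dcg_at(gains, x) / ideal if ideal > 0 else 0.0
```

For example, graded judgments could be mapped to gains such as 2, 1, and 0 (a hypothetical mapping) before calling `ndcg_at`; a ranking that already lists the highest gains first scores 1.0.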

18 Results


20 Conclusion

21
Term proximity may be comprehensively applied to improve various kinds of text rankers
It is helpful to integrate three types of term proximity
–Overall proximity
–Individual proximity
–Collective proximity
Term proximity information may be encoded to re-assess the TF of each term

