Information Retrieval Metrics for Speech Based Systems: A Systematic Review

1Micheal T. Adenibuyan, 2Oluwatoyin A. Enikuomehin and 2Benjamin S. Aribisala

1Department of Computer Science & Information Technology, College of Natural and Applied Sciences, Bells University of Technology, Ota, Ogun State, Nigeria
2Department of Computer Sciences, Faculty of Science, Lagos State University, Ojo, Lagos, Nigeria

Outline
Introduction
Methodology
Study Selection and Data Collection Process
Results
Study Characteristics
IR Evaluation Metrics
Summary
Limitations
Conclusion

Introduction Information overload. Spoken queries. Performance and efficiency raise interesting research issues in IR. Spoken queries impact retrieval performance (Turunen, Kurimo et al. 2013).

Methodology We used the Google Scholar repository to retrieve a selection of peer-reviewed articles. Resources checked include conference proceedings, journals, and books.

RESOURCE          TOTAL RESULTS OBTAINED   INITIAL SELECTION   FINAL SELECTION
Google Scholar    179                      32                  25
Total             179                      32                  25

Study Selection and Data Collection Process This systematic review was conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. The PRISMA guidelines help establish which studies were included in and which were excluded from this research.

PRISMA flow diagram of the study selection process.

Inclusion Criteria Studies on spoken queries for information retrieval and the metrics used for evaluating ad hoc retrieval systems were reviewed. Criteria: spoken query, evaluation metrics, performance. Research from notable IR conferences such as TREC (Text REtrieval Conference) and CLEF.

Exclusion Criteria Studies that only report ASR (Automatic Speech Recognition) or speech recognition. Studies that focus on a retrieval technique without evaluating it.

Publication Quality Assessment To assess the quality of each publication, the following measures were set to avoid bias. Was an experiment done to retrieve documents using spoken queries? Was any evaluation done? Is the corpus source known?

Coding Scheme We extracted the following data from each publication in our dataset, as shown in the table below.

Class           Items
Publication     Year, Source
Contributors    Name and affiliation, Country
Study purpose   Research questions, hypotheses and the use of theory; Objectives
Method          Subjects, Corpus, Search task
Measures        Output measures
Cited works     Genres, Years, Source titles, Item titles, Cited authors

Data Synthesis

Results
8 studies were on Mean Average Precision (MAP)
1 study applied Average Precision and R-Precision
1 applied a combination of Precision, Recall and F-Measure
1 combined Average Inverse Rank (AIR) and MAP
1 combined Average Precision and Recall
1 combined Precision and Recall
2 adopted F-Measure and MAP
1 applied the Generalised Average Precision (GAP) and MAP
1 applied the Precision-Recall Curve, F-Measure and MAP
1 used Precision, Recall and MAP
1 combined Normalized Discounted Cumulative Gain (NDCG), Mean Reciprocal Rank (MRR) and MAP

Included Studies Characteristics
Number of included studies: n = 25
Median year of publication: 2005 (range 1998-2015)
Country of publication: Canada, China, Finland, Ireland, Germany, UK, United States of America (USA), Japan, Singapore, Thailand, Indonesia, China (counts: 2, 6, 1, 3, 4)

Studies Characteristics Cont'd The majority of the articles (60%; n = 15) were published between 2007 and 2017, and about 40% (n = 10) were published between 1997 and 2006. 36% of the articles came from four publications: Speech Communication, IEEE Transactions on Audio, Speech & Language Processing, Acoustics, Speech & Signal Processing (ICASSP), and the NTCIR workshop, while the remaining 64% were scattered across sixteen (16) different publications.

Distribution of publications across Sources

IR Evaluation Metrics
Precision and Recall
Mean Average Precision (MAP)
Average Inverse Rank (AIR) / Mean Reciprocal Rank (MRR)
Normalized Discounted Cumulative Gain (NDCG)

Precision and Recall The traditional IR evaluation metrics.
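For reference (these definitions are standard and not taken from the slides), precision and recall for a single query, with Rel the set of relevant documents and Ret the set of retrieved documents, are conventionally written as:

\mathrm{Precision} = \frac{|Rel \cap Ret|}{|Ret|}, \qquad \mathrm{Recall} = \frac{|Rel \cap Ret|}{|Rel|}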

Mean Average Precision (MAP)

Mean Average Precision (MAP) This is the most widely used evaluation metric for IR system performance. It accounts for 60% of the publications studied in this review. It rewards systems that rank relevant documents above irrelevant documents. Reported MAP results range from 0.4191 to 0.620, with a relative improvement in retrieval performance of 13.8%.
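As a reminder of the standard formulation (the notation here is generic, not taken from the reviewed papers): average precision for a query q is the mean of the precision values at the ranks where relevant documents appear, and MAP averages this over the query set Q:

\mathrm{AP}(q) = \frac{1}{|R_q|} \sum_{k=1}^{n} P(k)\,\mathrm{rel}(k), \qquad \mathrm{MAP} = \frac{1}{|Q|} \sum_{q \in Q} \mathrm{AP}(q)

where R_q is the set of relevant documents for q, P(k) is the precision at cut-off k, and rel(k) is 1 if the document at rank k is relevant and 0 otherwise.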

Mean Reciprocal Rank (MRR) Also known as Average Inverse Rank (AIR). It achieved as much as a 14% improvement in IR performance, with results ranging from 0.747 to 0.826.
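The usual definition (again generic notation, added for reference): with rank_q the position of the first relevant document returned for query q,

\mathrm{MRR} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{\mathrm{rank}_q}

so a score of 1 means the first result was relevant for every query.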

Normalized Discounted Cumulative Gain (NDCG) It rewards rankings that return the document with the highest relevance level first, then the next highest relevance level, and so on; the higher the score, the better. Normalization allows comparison across queries with varying numbers of relevant results. 4% of the publications studied applied the NDCG metric to measure IR performance. Results range from 0.089 to 0.284.
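One commonly used form of the metric (graded relevance rel_i at rank i; other gain and discount functions exist):

\mathrm{DCG}_p = \sum_{i=1}^{p} \frac{2^{\mathrm{rel}_i} - 1}{\log_2(i+1)}, \qquad \mathrm{NDCG}_p = \frac{\mathrm{DCG}_p}{\mathrm{IDCG}_p}

where IDCG_p is the DCG of the ideal (perfectly ordered) ranking, so NDCG always lies in [0, 1].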

Conclusion This review has documented the evolution of speech-based IR evaluation metrics. One important outcome of this research is the speech-based IR metrics research bibliography.