1
What is the Entrance Exams Task
2
Outline
Task description
Data
Evaluation
Some Systems
Conclusion
References
3
Task description
A new task for evaluating machine reading systems by having them solve problems from Japanese university entrance exams. The task aims to evaluate systems under the same conditions under which humans are evaluated to enter university. [Note: reading comprehension tests are used to assess the degree to which people comprehend what they read, so the hypothesis is that it is reasonable to use these tests to assess the degree to which a machine "comprehends" what it is reading.]
4
Task description
Participants are asked to read a given document and answer questions. Questions are given in multiple-choice format, with several options from which a single answer must be selected. Note that:
[1]. Background text collections are not provided. Systems have to answer questions by referring to the "common sense knowledge" that high school students who aim to enter university are expected to have.
[2]. Question types are not restricted. Any type of reading comprehension question found in real entrance exams will be included in the test data.
5
Data
The data set is extracted from standardized English examinations for university admission in Japan. The exams are created by the Japanese National Center for University Admissions Tests. The original examinations include various styles of questions, such as word filling, grammatical error recognition, sentence filling, etc. However, the task focuses on reading comprehension.
Languages: English (original), Russian, French, Spanish, Italian and German.
7
Evaluation
The task evaluates participating systems by giving them a score between 0 and 1, using a measure that rewards abstention: systems might obtain higher scores if they leave unanswered the questions they could get wrong. Systems received evaluation scores from two different perspectives:
[1]. At the question-answering level: correct answers are counted individually without grouping them.
[2]. At the reading-test level: figures are given for each reading test as a whole.
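The scoring formula itself is not in the slide text. Given that unanswered questions can raise the score, a plausible reading is the c@1 measure used in the QA4MRE evaluations, c@1 = (n_R + n_U * n_R / n) / n, where n_R is the number of correct answers, n_U the number of unanswered questions and n the total. That identification is an assumption, as are the function names and the tuple layout below. The sketch computes the score at both levels described above.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: (reading_test_id, gold_option, predicted_option).
# A prediction of None means the system left the question unanswered.
Answer = Tuple[str, int, Optional[int]]


def c_at_1(n_correct: int, n_unanswered: int, n_total: int) -> float:
    """c@1 = (n_R + n_U * n_R / n) / n (assumed to be the task's measure)."""
    if n_total == 0:
        return 0.0
    return (n_correct + n_unanswered * n_correct / n_total) / n_total


def score(answers: List[Answer]) -> float:
    """Question-answering level: every question counts individually."""
    n_correct = sum(1 for _, gold, pred in answers if pred is not None and pred == gold)
    n_unanswered = sum(1 for _, _, pred in answers if pred is None)
    return c_at_1(n_correct, n_unanswered, len(answers))


def score_per_test(answers: List[Answer]) -> Dict[str, float]:
    """Reading-test level: one score per reading test."""
    by_test: Dict[str, List[Answer]] = defaultdict(list)
    for answer in answers:
        by_test[answer[0]].append(answer)
    return {test_id: score(items) for test_id, items in by_test.items()}


# Made-up answers for illustration only.
answers: List[Answer] = [
    ("t1", 1, 1), ("t1", 2, None), ("t1", 3, 4),
    ("t2", 2, 2), ("t2", 1, None),
]
overall = score(answers)
per_test = score_per_test(answers)
```

With these made-up answers, two correct and two unanswered out of five give a higher score than two correct and three wrong, which is the abstention effect the slide describes.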
10
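Some Systems
Task: registered groups; participant groups; submitted runs
EE2013: 10 registered groups [27 expressed interest]; 5 participant groups; 10 submitted runs
EE2014: 20 registered groups; 29 submitted runs
EE2013 runs: 3 runs (> random); 7 runs (<= random).
EE2014 runs: 14 runs (> random); 15 runs (<= random). [1 run (> 0.5)]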
11
Some Systems
System: key points [procedure]; result
JUCS [2013]: Text-Hypothesis (T-H), textual entailment, answer ranking [retrieve relevant sentences (T); assign a ranking score to each T-H pair; rank answers]; 0.42
NIIJ [2013]: text similarity, textual entailment [character resolver; retrieve related sentences, T-H; calculate the T-H confidence scores from textual entailment]; 0.35
DIPF [2014]: coreference resolution, text similarity, textual entailment [retrieve sentences; textual entailment on the Text-Hypothesis pair; answer similarity; answer selection]; 0.375
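The procedure these systems share (retrieve sentences T, score each T-H pair, rank the answers) can be sketched roughly as follows. This is not any participant's implementation: the hypothesis is formed by simply concatenating the question and a candidate answer, and a token-coverage ratio stands in for a real textual entailment score. All function names and the example text are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased word set; a deliberately crude text representation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_sentences(document: str) -> List[str]:
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def retrieve(document: str, hypothesis: str, k: int = 2) -> List[str]:
    """Step 1: keep the k sentences (T) sharing the most words with H."""
    h = tokens(hypothesis)
    sentences = split_sentences(document)
    return sorted(sentences, key=lambda s: len(tokens(s) & h), reverse=True)[:k]


def th_score(text: str, hypothesis: str) -> float:
    """Step 2: stand-in for an entailment score, the share of H's words found in T."""
    h = tokens(hypothesis)
    return len(tokens(text) & h) / len(h) if h else 0.0


def rank_answers(document: str, question: str, candidates: List[str]) -> List[Tuple[str, float]]:
    """Step 3: score each candidate by its best T-H pair and sort descending."""
    ranked = []
    for candidate in candidates:
        hypothesis = f"{question} {candidate}"
        best = max((th_score(t, hypothesis) for t in retrieve(document, hypothesis)), default=0.0)
        ranked.append((candidate, best))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Made-up reading test for illustration only.
document = ("Ken missed the bus this morning. He walked to school in the rain. "
            "His teacher lent him a towel.")
question = "How did Ken get to school?"
candidates = ["He walked", "He took the bus", "He rode a bicycle", "His teacher drove him"]
ranking = rank_answers(document, question, candidates)
```

A real system would replace `th_score` with a trained entailment component; the overlap ratio only keeps the sketch self-contained.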
12
Some Systems
System: key points [procedure]; result
Synapse [2014]: deep syntactic and semantic analysis, Clause Description Structures (CDS) [remove some candidate answers; compute the proximity between documents and candidate answers using CDSs; rank and select]; 0.59 [French], 0.45 [English]
CICNLP [2014]: graph generation, cosine similarity [build hypotheses (question with candidate answers); graph representations, linguistic features; cosine similarity and ranking]; 0.375
LIMSI-CNRS [2014]: semantic relatedness, alignment, validation (rules) [retrieve passages; align answer PAS (predicate-argument structure) with passage PAS; validation/invalidation]; 0.25
CSGS [2014]: semantic similarity, alignment model [sentence selection; answer selection]; 0.362
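The cosine-similarity ranking step mentioned for CICNLP can be illustrated with a much simpler representation. The sketch below uses plain bag-of-words count vectors in place of the graph representations and linguistic features the system is described as using, so it shows only the ranking idea; the function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def bow(text: str) -> Counter:
    """Bag-of-words count vector (a simplification of graph-based features)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_cosine(document: str, question: str, candidates: List[str]) -> List[Tuple[str, float]]:
    """Build one hypothesis per candidate and rank by similarity to the document."""
    doc_vec = bow(document)
    scored = [(c, cosine(bow(f"{question} {c}"), doc_vec)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_by_cosine(document, question, candidates)` returns the candidates with their scores, best first. A ranking of this kind rewards surface overlap only, so a distractor that reuses the document's wording can outrank the correct answer.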
13
Conclusion
The level of textual inference that current systems perform is not enough to solve the majority of questions. Current systems based only on textual similarity cannot address the challenge (sometimes, the more similar an option looks, the more likely it is to be wrong).[1]
In order to obtain more than 2/3 of good answers, pragmatic knowledge and inference are essential.[3]
The Entrance Exams task shows that question answering is far from being solved.[2]
14
References
[1]. Anselmo Peñas, Yusuke Miyao, Eduard Hovy, Pamela Forner and Noriko Kando. Overview of QA4MRE 2013 Entrance Exams Task. In: CLEF (Online Working Notes/Labs/Workshop). (2013)
[2]. Anselmo Peñas, Yusuke Miyao, Alvaro Rodrigo, Eduard Hovy and Noriko Kando. Overview of CLEF QA Entrance Exams Task. CLEF 2014 Working Notes. (2014)
[3]. Dominique Laurent, Baptiste Chardon, Sophie Negre and Patrick Seguela. English run of Synapse Développement at Entrance Exams. CLEF 2014 Working Notes. (2014)
15
Thank you