INFILE - INformation FILtering Evaluation

Evaluation of adaptive filtering systems for business intelligence and technology watch.

Towards real-use conditions:
- Automatic interrogation of the systems from the organizer's site
- Boolean decision (document relevant to the topic or not)
- Simulated user feedback on the system's responses (only on documents the system marked positive)
- No training on the topics before the beginning of the test
Protocol of experimentation:
- The organizer sends a document to the tested system.
- The system sends a positive answer if it considers the document relevant to a topic.
- The organizer sends negative feedback on answers incorrectly marked positive; all other answers are considered correct.
- Each system can use the positive and negative feedback to adapt its model and topic descriptions.
- A dry run will be organized to check that everything works correctly.
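The exchange loop above can be sketched as follows. This is a minimal illustration, not the campaign's real interface: the `KeywordSystem` class, its `judge`/`feedback` methods, and the `run_campaign` driver are all hypothetical names invented for the sketch.

```python
class KeywordSystem:
    """Toy filtering system: marks a document positive if it contains any topic keyword."""

    def __init__(self, keywords):
        self.keywords = set(keywords)

    def judge(self, text):
        # Boolean decision: relevant to the topic or not
        return any(k in text for k in self.keywords)

    def feedback(self, doc_id, positive):
        # An adaptive system would update its topic profile here;
        # the toy system simply ignores the feedback.
        pass


def run_campaign(documents, system, relevant_ids):
    """Stream documents to the system; send negative feedback on wrong positives.

    Per the protocol, only incorrect positives trigger a message from the
    organizer -- silence means the answer is considered correct.
    """
    answers = {}
    for doc_id, text in documents:
        if system.judge(text):
            answers[doc_id] = True
            if doc_id not in relevant_ids:
                system.feedback(doc_id, positive=False)
    return answers
```

For example, a system watching the keyword "solar" would mark only the matching document positive and receive negative feedback if that document is not in the ground truth.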
Data collections:
- AFP (Agence France-Presse) news wires in French, English, and Arabic.
- The corpus is available in 19 languages; new languages could be added in future campaigns.
- About 100,000 documents per language.

Topics: a minimum of 25 topics, each composed of:
- a list of keywords (noun phrases)
- one to three relevant document passages
Ground truth:
We do not want to reuse data from a previous ad hoc campaign, so no pooling is possible before the test begins. The corpus is built with AFP users by merging documents known to be relevant, documents likely to be confused with relevant ones, and a large number of non-relevant documents. A minimal pooling will be done at the end to adjust the evaluations.
Evaluations:
The measures build on the experience of the TREC filtering task: the evolution of precision, recall, linear utility, and F-beta over time, computed for example every 30,000 documents. In addition, mean precision, recall, linear utility, and F-beta will be computed over the total run at the end. We are open to discussion about the metrics.
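The four measures can be computed from the confusion counts of a run as sketched below. The utility weights (2 for a correct positive, -1 for a false alarm) follow the common TREC filtering convention and are an assumption here, as is the choice of beta; the slide does not fix either value.

```python
def filtering_metrics(tp, fp, fn, beta=1.0, w_pos=2, w_neg=1):
    """Precision, recall, F-beta, and linear utility for a filtering run.

    tp: relevant documents the system marked positive
    fp: non-relevant documents the system marked positive
    fn: relevant documents the system missed
    The utility weights w_pos/w_neg follow TREC practice (assumed here).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    b2 = beta * beta
    if precision + recall:
        f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    else:
        f_beta = 0.0
    # Linear utility: reward each correct positive, penalize each false alarm.
    utility = w_pos * tp - w_neg * fp
    return precision, recall, f_beta, utility
```

To track the evolution over time, this function would simply be re-evaluated on the cumulative counts after each batch (e.g. every 30,000 documents).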