Overview of GeoCLEF 2007
Information and Communication Technologies

Presentation transcript:

Slide 1: Overview of GeoCLEF 2007
- IR techniques
- IE/NLP techniques
- GIR techniques
- Systems
- Resources
- Experiments
- Translation
- General comments

Slide 2: IR techniques and systems
- automatic and manual query expansion
- blind relevance feedback
- INL2 term weighting model
- divergence from randomness framework
- latent Dirichlet allocation model
- logistic regression
- query decomposition
- vector space model
- stemming
- systems: Lemur (2), Lucene (4), MG4J, Terrier (2), Zebra DBMS
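Two of the techniques listed on this slide, the vector space model and blind relevance feedback, can be illustrated together. The following is a minimal sketch only, not any participant's implementation: the toy documents, the TF-IDF weighting variant, and the parameters `top_k`, `n_terms` and `beta` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build TF-IDF vectors (term -> weight) for a list of tokenized documents."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    # Smoothed IDF; one of many possible weighting variants (an assumption here).
    idf = {t: math.log((n + 1) / (df[t] + 1)) + 1.0 for t in df}
    vecs = [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(query_vec, doc_vecs):
    """Return document indices sorted by decreasing cosine similarity."""
    return sorted(range(len(doc_vecs)),
                  key=lambda i: cosine(query_vec, doc_vecs[i]),
                  reverse=True)


def blind_feedback(query_vec, doc_vecs, top_k=2, n_terms=3, beta=0.5):
    """Expand the query with the strongest terms of the top_k ranked documents.

    The top-ranked documents are assumed relevant without any user judgement,
    which is what makes the feedback "blind".
    """
    top = rank(query_vec, doc_vecs)[:top_k]
    centroid = Counter()
    for i in top:
        for t, w in doc_vecs[i].items():
            centroid[t] += w / top_k
    new_terms = [t for t, _ in centroid.most_common() if t not in query_vec][:n_terms]
    expanded = dict(query_vec)
    for t in new_terms:
        expanded[t] = beta * centroid[t]
    return expanded, new_terms


# Toy collection (hypothetical text, not GeoCLEF data).
docs = [
    "floods in the netherlands rhine river".split(),
    "rhine river flooding damaged dutch towns".split(),
    "football results from lisbon".split(),
]
vecs, idf = tfidf_vectors(docs)
query_vec = {t: idf.get(t, 0.0) for t in ["rhine", "floods"]}
expanded, new_terms = blind_feedback(query_vec, vecs)
print(new_terms)
print(rank(expanded, vecs))
```

The expanded query is then run against the same index; whether this helps depends heavily on how good the initial top-ranked documents are.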

Slide 3: NLP techniques and systems
- named entity recognition
- WordNet-based expansion
- semantic analysis
- part-of-speech tagging
- use of a QA system for subqueries
- decompounding (for German)
- WordNet lemmatizer
- systems: LingPipe, ANNIE (GATE)

Slide 4: GIR/GIE techniques
- location disambiguation
- geographic unique strings
- location normalization
- query expansion based on geographical terms
- query expansion based on a geographic ontology
- heuristic geographically informed filtering
- removing candidates using shape files for close or near geographical relations
- separate geographical indexes
- geographic cooccurrence model (based on Wikipedia)
- geographic relation finder
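To make location disambiguation and geographic query expansion more concrete, here is a small sketch under stated assumptions. The gazetteer below is a hypothetical toy structure with illustrative population figures (not real data and not the format of any gazetteer named in these slides), and the scoring heuristic (context overlap first, then population) is just one plausible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    country: str
    population: int   # illustrative value only, not a real figure
    related: tuple    # names of containing regions, used for expansion


# Hypothetical toy gazetteer keyed by lowercased toponym.
TOY_GAZETTEER = {
    "paris": [
        Place("Paris", "France", 2_000_000, ("France",)),
        Place("Paris", "United States", 25_000, ("Texas",)),
    ],
    "lisbon": [
        Place("Lisbon", "Portugal", 500_000, ("Portugal",)),
    ],
}


def disambiguate(toponym, context_tokens):
    """Pick the candidate whose related regions appear in the context.

    Ties (including the case of no contextual evidence) fall back to the
    most populous candidate.
    """
    candidates = TOY_GAZETTEER.get(toponym.lower(), [])
    if not candidates:
        return None
    context = {t.lower() for t in context_tokens}

    def score(place):
        overlap = sum(1 for r in place.related if r.lower() in context)
        return (overlap, place.population)

    return max(candidates, key=score)


def expand_query(tokens):
    """Append the related region names of every recognized toponym."""
    expanded = list(tokens)
    seen = {t.lower() for t in tokens}
    for tok in tokens:
        place = disambiguate(tok, tokens)
        if place is None:
            continue
        for region in place.related:
            if region.lower() not in seen:
                expanded.append(region)
                seen.add(region.lower())
    return expanded


print(disambiguate("Paris", ["museums", "in", "Paris", "Texas"]).country)
print(expand_query(["wine", "Lisbon"]))
```

A real system would draw candidates from a full gazetteer and would typically use richer evidence, but the two steps (resolve the place, then add related geographic terms) stay the same.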

Slide 5: GIR/GIE resources
- GeoNames gazetteer
- World Gazetteer
- Getty TGN
- GNIS
- GeoWorldMap World gazetteer
- ADL Feature Type thesaurus
- GKB 2.0
- Spanish toponymy list
- plus Wikipedia and WordNet

Slide 6: Translation (only queries)
- Systran (PT-EN)
- Promt (ES-EN)
- Promt (EN-DE)
- LEC Power Translator (all possible pairs)
- transfer MT system (ES-EN, ES-PT)
- Toggletext (ID-EN)

Slide 7: Experiments
In addition to the particular experiments of each group concerning their original approaches:
- Separate indexes for geographic and non-geographic information
- Apply different techniques to different parts of the query
- Use T, TD or TDN (topic title, description and narrative fields)
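The experimental settings on this slide can be sketched as follows. This is an illustration under assumptions, not a description of any group's run: the topic dictionary, its field names, the score dictionaries and the `geo_weight` parameter are all hypothetical, and linear interpolation is only one possible way to merge results from separate textual and geographic indexes.

```python
# Map the conventional field letters to keys of a (hypothetical) topic dict.
FIELD_KEYS = {"T": "title", "D": "description", "N": "narrative"}


def build_query(topic, fields="T"):
    """Concatenate the chosen topic fields, e.g. "T", "TD" or "TDN"."""
    return " ".join(topic[FIELD_KEYS[f]] for f in fields)


def combine_scores(text_scores, geo_scores, geo_weight=0.3):
    """Linearly interpolate scores from a text index and a geographic index.

    Documents missing from one index contribute a score of 0 for it.
    """
    doc_ids = set(text_scores) | set(geo_scores)
    return {
        d: (1.0 - geo_weight) * text_scores.get(d, 0.0)
           + geo_weight * geo_scores.get(d, 0.0)
        for d in doc_ids
    }


# Hypothetical topic text and scores, for illustration only.
topic = {
    "title": "Floods in the Netherlands",
    "description": "Documents reporting floods in the Netherlands.",
    "narrative": "Relevant documents name an affected Dutch town or river.",
}
print(build_query(topic, "TD"))

combined = combine_scores({"d1": 0.8, "d2": 0.4}, {"d2": 1.0, "d3": 0.5}, geo_weight=0.5)
ranking = sorted(combined, key=combined.get, reverse=True)
print(ranking)
```

Setting `geo_weight` to 0 reduces the combination to a text-only run, which makes it easy to compare against the text-only baseline.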

Slide 8: Comments
- Very few papers mention distinct treatment for different kinds of topics
  - although some discuss differences between topics
  - only one provides some per-topic analysis
- Most participants have best results with text only!
- Most participants use NER, but no evaluation of NER in itself has been reported
  - does it really help, or does it also diminish performance?
  - they don't use the output of NER in the same way: wide differences in behaviour after NER has been invoked