Lemur Indri Search Engine Yatish Hegde 03/03/2010
Background: Open source text search engine. Combines language modeling and inference networks. Inquery query language. API accessible from C++, Java, C# and PHP. Supported document formats: html, xml, txt, trectext, trecweb, ppt, doc*, ppt*
Resources Website: Tutorials: Forum:
How to get started? Cygwin: (include the “perl”, “vi editor” and “make” packages while installing) Lemur Toolkit: TREC Eval:
Installing Lemur: Inside the Lemur directory, run ./configure, then make, then make install. Build Index: IndriBuildIndex. Run Query: IndriRunQuery.
Building Index
IndriBuildIndex parameters:
index: /home/lemur/testindex
memory: 1G
corpus: /home/lemur/testdata/firstCorpus (class: trectext)
corpus: /home/lemur/testdata/secondCorpus (class: trecweb)
stemmer: krovetz
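The values on this slide can be assembled into a parameter file programmatically. The following is a minimal Python sketch, not part of the original slides. It assumes the usual Indri parameter-file layout (a `parameters` root with `index`, `memory`, `corpus/path`, `corpus/class` and `stemmer/name` elements); the function name `build_index_parameters` is hypothetical, and the element names should be checked against the Indri documentation for your version.

```python
import xml.etree.ElementTree as ET


def build_index_parameters(index_path, memory, corpora, stemmer):
    """Return an XML string for an IndriBuildIndex parameter file.

    Assumption: element names follow the commonly documented Indri layout.
    `corpora` is a list of (path, document_class) pairs.
    """
    root = ET.Element("parameters")
    ET.SubElement(root, "index").text = index_path
    ET.SubElement(root, "memory").text = memory
    for path, doc_class in corpora:
        corpus = ET.SubElement(root, "corpus")
        ET.SubElement(corpus, "path").text = path
        ET.SubElement(corpus, "class").text = doc_class
    stem = ET.SubElement(root, "stemmer")
    ET.SubElement(stem, "name").text = stemmer
    return ET.tostring(root, encoding="unicode")


# Values taken from the slide.
params_xml = build_index_parameters(
    "/home/lemur/testindex",
    "1G",
    [
        ("/home/lemur/testdata/firstCorpus", "trectext"),
        ("/home/lemur/testdata/secondCorpus", "trecweb"),
    ],
    "krovetz",
)
print(params_xml)
```

Saving the printed XML to a file and passing that file to IndriBuildIndex as its argument should then build the index, assuming the paths exist on your machine.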
Running Query
IndriRunQuery
Query File: 701, oil industry history
Stop Word File: the
Query Options File: true, /path/to/index, 1000
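The three files on this slide can be generated with a few lines of Python. This is a sketch under assumptions, not the original slide content: the XML element names were lost from the slide, so the code assumes the usual Indri layout, reading `701` as a query `number`, `oil industry history` as its `text`, `the` as a `stopper/word`, `true` as `trecFormat`, and `1000` as `count`. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def _to_xml(root):
    return ET.tostring(root, encoding="unicode")


def query_parameters(queries):
    """Build the query file from (number, text) pairs."""
    root = ET.Element("parameters")
    for number, text in queries:
        query = ET.SubElement(root, "query")
        ET.SubElement(query, "number").text = str(number)
        ET.SubElement(query, "text").text = text
    return _to_xml(root)


def stopword_parameters(words):
    """Build the stop word file from a list of words."""
    root = ET.Element("parameters")
    stopper = ET.SubElement(root, "stopper")
    for word in words:
        ET.SubElement(stopper, "word").text = word
    return _to_xml(root)


def option_parameters(index_path, count, trec_format=True):
    """Build the query options file (assumed mapping of the slide's values)."""
    root = ET.Element("parameters")
    ET.SubElement(root, "trecFormat").text = "true" if trec_format else "false"
    ET.SubElement(root, "index").text = index_path
    ET.SubElement(root, "count").text = str(count)
    return _to_xml(root)


query_xml = query_parameters([(701, "oil industry history")])
stop_xml = stopword_parameters(["the"])
options_xml = option_parameters("/path/to/index", 1000)
print(query_xml)
print(stop_xml)
print(options_xml)
```

Each string would be written to its own file, and the three files passed together on the IndriRunQuery command line.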
Converting Topic File into Query File
Topic File:
Number: 301
International Organized Crime
Description: Identify organizations that participate in international criminal activity, the activity, and, if possible, collaborating organizations and the countries involved.
Narrative: A relevant document must as a minimum identify the organization and the type of illegal activity (e.g., Columbian cartel exporting cocaine). Vague references to international drug trade without identification of the organization(s) involved would not be relevant.
Converting Topic File into Query File. Perl Program: ./topicToQuery.pl [-t] [-d]. For help: ./topicToQuery.pl -h
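To illustrate the kind of conversion involved, here is a minimal Python sketch. It is not the topicToQuery.pl script and makes no claim about what that script's options do. It assumes the topic file uses the standard TREC tags (`<num>`, `<title>`, `<desc>`, `<narr>`), which do not survive on the slide, and the names `parse_topic`, `to_indri_query`, `use_title` and `use_desc` are hypothetical.

```python
import re

# Topic 301 from the slide, with the standard TREC tags assumed.
TOPIC = """<top>
<num> Number: 301
<title> International Organized Crime
<desc> Description:
Identify organizations that participate in international criminal activity, the activity, and, if possible, collaborating organizations and the countries involved.
</top>"""

FIELD = re.compile(
    r"<(num|title|desc|narr)>(.*?)(?=<(?:num|title|desc|narr|/top)>|\Z)",
    re.S,
)


def parse_topic(text):
    """Split one topic into its fields, dropping labels such as 'Number:'."""
    fields = {}
    for tag, body in FIELD.findall(text):
        body = re.sub(r"^(Number|Topic|Description|Narrative):", "", body.strip())
        fields[tag] = " ".join(body.split())
    return fields


def to_indri_query(fields, use_title=True, use_desc=False):
    """Join the chosen fields into a #combine query without punctuation."""
    parts = []
    if use_title:
        parts.append(fields.get("title", ""))
    if use_desc:
        parts.append(fields.get("desc", ""))
    # Punctuation is replaced by spaces so only plain terms reach the query.
    words = re.sub(r"[^A-Za-z0-9\s]", " ", " ".join(parts)).split()
    return "#combine( " + " ".join(words) + " )"


fields = parse_topic(TOPIC)
print(fields["num"], to_indri_query(fields))
print(fields["num"], to_indri_query(fields, use_title=False, use_desc=True))
```

Stripping punctuation before building the query is a design choice of this sketch; it keeps characters such as commas and parentheses from being read as query syntax.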
TREC Eval: Build with make. Run: trec_eval -q -c -M1000 official_qrels query_results. More Documentation: README
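As a rough illustration of what is being measured, the sketch below computes average precision for one query from qrels lines and TREC-format result lines. It is a simplified Python illustration, not a reimplementation of trec_eval, which reports many more measures and breaks score ties in its own way. The document IDs, scores and run tag are made up, and the function names are hypothetical; the `max_docs` cutoff mirrors the idea behind `-M1000`.

```python
from collections import defaultdict


def read_qrels(lines):
    """Map each query ID to the set of documents judged relevant."""
    relevant = defaultdict(set)
    for line in lines:
        qid, _iteration, docno, rel = line.split()
        if int(rel) > 0:
            relevant[qid].add(docno)
    return relevant


def read_run(lines, max_docs=1000):
    """Map each query ID to its documents, best score first, cut at max_docs."""
    scored = defaultdict(list)
    for line in lines:
        qid, _q0, docno, _rank, score, _tag = line.split()
        scored[qid].append((float(score), docno))
    return {
        qid: [doc for _, doc in sorted(docs, key=lambda pair: -pair[0])][:max_docs]
        for qid, docs in scored.items()
    }


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant doc appears."""
    hits, total = 0, 0.0
    for rank, docno in enumerate(ranking, start=1):
        if docno in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Made-up judgments and results for query 701.
qrels = read_qrels(["701 0 D1 1", "701 0 D2 0", "701 0 D3 1"])
run = read_run([
    "701 Q0 D2 1 -3.0 demo",
    "701 Q0 D1 2 -4.0 demo",
    "701 Q0 D3 3 -5.0 demo",
])
ap = average_precision(run["701"], qrels["701"])
print(round(ap, 4))
```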
Lemur Search UI. User Interface: The Lemur CGI Application. How does it look?
Indri Query Language
#combine(white house)
#1(white house)
#5(white house)
#band(white house)
#band(oil fields)
#1(white house)
Topic 301:
#combine( Identify organizations that participate in #max( #1(international criminal activity) international criminal activity ) the activity and if possible collaborating organizations and the countries involved )
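The operators above compose as nested strings, so queries can be assembled with small helpers. The following Python sketch is only an illustration with hypothetical helper names. As generally documented for Indri, `#combine` merges the beliefs of its arguments, `#N` is an ordered window (so `#1` matches an exact phrase), `#band` requires all terms to be present, and `#max` takes the highest-scoring argument; check the query language reference for the exact semantics.

```python
def combine(*parts):
    """#combine: merge the beliefs of all arguments into one score."""
    return "#combine( " + " ".join(parts) + " )"


def ordered_window(n, *terms):
    """#N: terms must appear in order within a window of size n."""
    return "#" + str(n) + "(" + " ".join(terms) + ")"


def band(*terms):
    """#band: boolean AND, every term must occur in the document."""
    return "#band(" + " ".join(terms) + ")"


def max_of(*parts):
    """#max: use the best-scoring argument."""
    return "#max( " + " ".join(parts) + " )"


phrase = ordered_window(1, "white", "house")
near = ordered_window(5, "white", "house")
both = band("white", "house")
loose = combine("white", "house")

# Prefer the exact phrase, but fall back to the individual terms.
crime = max_of(
    ordered_window(1, "international", "criminal", "activity"),
    "international criminal activity",
)
topic_301 = combine("Identify organizations that participate in", crime)

for query in (loose, phrase, near, both, topic_301):
    print(query)
```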
Contact If you have questions - Yatish Hegde:
Thank You