Lemur Indri Search Engine Yatish Hegde 03/03/2010.

Background
  Open source text search engine
  Combines language modeling and inference networks
  Inquery query language
  API accessible from C++, Java, C# and PHP
  Document formats: html, xml, txt, trectext, trecweb, ppt, doc*, ppt*

Resources Website: Tutorials: Forum:

How to get started?
  Cygwin: (include the “perl”, “vi editor” and “make” packages while installing)
  Lemur Toolkit:
  TREC Eval:

Installing Lemur
  Inside the Lemur directory:
    ./configure
    make
    make install
  Build Index - IndriBuildIndex
  Run Query - IndriRunQuery

Building Index
  IndriBuildIndex
    /home/lemur/testindex
    1G
    /home/lemur/testdata/firstCorpus trectext
    /home/lemur/testdata/secondCorpus trecweb
    krovetz
    p
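IndriBuildIndex reads its settings from an XML parameter file, and the values on this slide appear to be such a file with the markup lost. A minimal Python sketch that regenerates one is given here. It assumes the values map to the `index`, `memory`, `corpus` (`path`, `class`) and `stemmer` (`name`) parameters. The function name `build_index_parameters` is hypothetical, and the trailing `p` on the slide is left uninterpreted.

```python
import xml.etree.ElementTree as ET


def build_index_parameters(index_path, memory, corpora, stemmer=None):
    """Return an IndriBuildIndex parameter file as an XML string.

    corpora is a list of (path, document_class) pairs.
    """
    root = ET.Element("parameters")
    ET.SubElement(root, "index").text = index_path
    ET.SubElement(root, "memory").text = memory
    for path, doc_class in corpora:
        corpus = ET.SubElement(root, "corpus")
        ET.SubElement(corpus, "path").text = path
        ET.SubElement(corpus, "class").text = doc_class
    if stemmer:
        ET.SubElement(ET.SubElement(root, "stemmer"), "name").text = stemmer
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Values taken from the slide.
build_params = build_index_parameters(
    "/home/lemur/testindex",
    "1G",
    [
        ("/home/lemur/testdata/firstCorpus", "trectext"),
        ("/home/lemur/testdata/secondCorpus", "trecweb"),
    ],
    stemmer="krovetz",
)
print(build_params)
```

Saving the printed text to a file and passing that file's path to IndriBuildIndex as its argument is the usual way to run the build.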

Running Query
  IndriRunQuery
  Query File
    701
    oil industry history
  Stop Word File
    the
  Query Options File
    true
    /path/to/index
    1000
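The three files on this slide are also XML parameter files, and IndriRunQuery accepts several of them on one command line. The sketch below writes their contents into a single file for brevity. It assumes the slide's values map to `query` (`number`, `text`), `stopper` (`word`), `index` and `count`, and that the bare `true` is the `trecFormat` option. The function name `run_query_parameters` is hypothetical.

```python
import xml.etree.ElementTree as ET


def run_query_parameters(index_path, queries, count=1000, stopwords=(), trec_format=True):
    """Return an IndriRunQuery parameter file as an XML string.

    queries is a list of (number, text) pairs.
    """
    root = ET.Element("parameters")
    ET.SubElement(root, "index").text = index_path
    ET.SubElement(root, "count").text = str(count)
    # Assumption: the slide's "true" switches on TREC-format output.
    ET.SubElement(root, "trecFormat").text = "true" if trec_format else "false"
    if stopwords:
        stopper = ET.SubElement(root, "stopper")
        for word in stopwords:
            ET.SubElement(stopper, "word").text = word
    for number, text in queries:
        query = ET.SubElement(root, "query")
        ET.SubElement(query, "number").text = str(number)
        ET.SubElement(query, "text").text = text
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Values taken from the slide.
query_params = run_query_parameters(
    "/path/to/index",
    [(701, "oil industry history")],
    count=1000,
    stopwords=["the"],
)
print(query_params)
```

With TREC-format output, each result line can be fed straight to trec_eval.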

Converting Topic File into Query File
  Topic File
    Number: 301
    International Organized Crime
    Description: Identify organizations that participate in international criminal activity, the activity, and, if possible, collaborating organizations and the countries involved.
    Narrative: A relevant document must as a minimum identify the organization and the type of illegal activity (e.g., Columbian cartel exporting cocaine). Vague references to international drug trade without identification of the organization(s) involved would not be relevant.

Converting Topic File into Query File
  Perl Program:
    ./topicToQuery.pl [-t] [-d]
    ./topicToQuery.pl -h
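The conversion itself is simple enough to sketch. The Python below is not a port of topicToQuery.pl, whose internals are not shown on the slides. It only illustrates the idea under two assumptions: punctuation is replaced with spaces so it cannot be read as query syntax, and the remaining words are wrapped in `#combine`. The function name `topic_to_query` is hypothetical.

```python
import re


def topic_to_query(text):
    """Turn free topic text into a flat #combine query."""
    # Keep only letters, digits and whitespace: characters such as commas
    # or parentheses could otherwise be parsed as part of the query language.
    words = re.sub(r"[^A-Za-z0-9\s]", " ", text).split()
    return "#combine( " + " ".join(words) + " )"


# Title and description of topic 301, as on the previous slide.
title = "International Organized Crime"
description = (
    "Identify organizations that participate in international criminal "
    "activity, the activity, and, if possible, collaborating organizations "
    "and the countries involved."
)

title_query = topic_to_query(title)
description_query = topic_to_query(description)
print(title_query)
print(description_query)
```

Choosing between the title and the description as the query source is presumably what the script's `-t` and `-d` switches control. That reading is an assumption, and `./topicToQuery.pl -h` is the place to confirm it.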

TREC Eval
  make
  trec_eval -q -c -M1000 official_qrels query_results
  More Documentation: README
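To make the two input files concrete, here is a small Python sketch that reads qrels lines (`query-id iteration docno relevance`) and TREC-format result lines (`query-id Q0 docno rank score run-id`) and computes average precision for one query. It is a simplified illustration, not trec_eval's implementation: it handles score ties differently, and trec_eval reports many more measures. The document IDs and scores are made up.

```python
from collections import defaultdict


def read_qrels(lines):
    """Map each query id to the set of documents judged relevant."""
    relevant = defaultdict(set)
    for line in lines:
        qid, _iteration, docno, rel = line.split()
        if int(rel) > 0:
            relevant[qid].add(docno)
    return relevant


def read_run(lines):
    """Map each query id to its documents, ordered by descending score."""
    scored = defaultdict(list)
    for line in lines:
        qid, _q0, docno, _rank, score, _run_id = line.split()
        scored[qid].append((float(score), docno))
    return {
        qid: [docno for _score, docno in sorted(docs, key=lambda pair: -pair[0])]
        for qid, docs in scored.items()
    }


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant document appears."""
    hits = 0
    total = 0.0
    for rank, docno in enumerate(ranking, start=1):
        if docno in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Made-up judgments and results for one query.
qrels = read_qrels(["301 0 D1 1", "301 0 D2 0", "301 0 D3 1"])
run = read_run([
    "301 Q0 D1 1 -4.2 demo",
    "301 Q0 D2 2 -5.0 demo",
    "301 Q0 D3 3 -5.5 demo",
])
ap_301 = average_precision(run["301"], qrels["301"])
print(round(ap_301, 4))
```

The relevant documents sit at ranks 1 and 3, so the value is the mean of 1/1 and 2/3.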

Lemur Search UI
  User Interface: The Lemur CGI Application
  How does it look?

Indri Query Language
  #combine( white house)
  #1(white house)
  #5(white house)
  #band(white house)
  #band(oil fields) #1(white house)
  301 #combine( Identify organizations that participate in #max( #1( international criminal activity) international criminal activity ) the activity and if possible collaborating organizations and the countries involved)
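For reference, `#combine` merges the beliefs of its arguments, `#N` is an ordered window (so `#1` matches an exact phrase), `#band` is a Boolean AND, and `#max` takes the highest-scoring argument. Queries like the ones on this slide can be assembled programmatically. The helper names in this Python sketch are hypothetical, and it only builds strings without validating them against Indri.

```python
def combine(*parts):
    """Belief combination of terms or sub-queries."""
    return "#combine( " + " ".join(parts) + " )"


def ordered_window(width, *terms):
    """Terms must appear in order within the given window; width 1 is an exact phrase."""
    return "#" + str(width) + "( " + " ".join(terms) + " )"


def boolean_and(*parts):
    """All arguments must occur in the document."""
    return "#band( " + " ".join(parts) + " )"


simple_query = combine("white", "house")
phrase_query = ordered_window(1, "white", "house")
near_query = ordered_window(5, "white", "house")
nested_query = boolean_and(
    ordered_window(1, "oil", "fields"),
    ordered_window(1, "white", "house"),
)

for query in (simple_query, phrase_query, near_query, nested_query):
    print(query)
```

Because each helper returns a plain string, operators nest by passing one helper's result as an argument to another.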

Contact If you have questions - Yatish Hegde:

Thank You