Discussion Class 7 Lucene.


Discussion Classes

Format:
  Question
  Ask a member of the class to answer.
  Provide opportunity for others to comment.

When answering:
  Stand up.
  Give your name. Make sure that the TA hears it.
  Speak clearly so that all the class can hear.

Suggestions:
  Do not be shy at presenting partial answers.
  Differing viewpoints are welcome.

Question 1
(a) How did you go about looking for information about Lucene? Did you search or browse? What information sources did you find most useful? Is there some information that you were unable to find?
(b) Who created Lucene? What is the business model?

Question 2
(a) What are the underlying search mechanisms supported by Lucene?
(b) What algorithms does it use? What data structures does it use?
    (i) To store and index fielded data
    (ii) To maintain term weights
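
A possible starting point for Question 2(b): Lucene is built around an inverted index, a mapping from terms to the documents that contain them, and its terms are qualified by field name. The toy Python sketch below illustrates that idea only. It is not Lucene code: the class name FieldedIndex, the (field, term) dictionary key, and the simple tf-idf weight are all assumptions chosen for illustration, and Lucene's on-disk structures and scoring are considerably more elaborate.

```python
import math
from collections import defaultdict


class FieldedIndex:
    """Toy inverted index keyed by (field, term); not Lucene's implementation."""

    def __init__(self):
        # (field, term) -> {doc_id: term frequency within that field}
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id, fields):
        """Index one document given as a mapping of field name to text."""
        self.doc_count += 1
        for field, text in fields.items():
            for term in text.lower().split():
                entry = self.postings[(field, term)]
                entry[doc_id] = entry.get(doc_id, 0) + 1

    def search(self, field, term):
        """Return (doc_id, weight) pairs, highest tf-idf weight first."""
        entry = self.postings.get((field, term.lower()), {})
        if not entry:
            return []
        # Assumed weighting: raw term frequency times a smoothed idf.
        idf = math.log(self.doc_count / len(entry)) + 1.0
        scored = [(doc_id, tf * idf) for doc_id, tf in entry.items()]
        return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


index = FieldedIndex()
index.add(1, {"title": "Lucene in Action", "body": "Lucene is a search library"})
index.add(2, {"title": "Crawling the Web", "body": "a crawler feeds the search index"})
print(index.search("title", "lucene"))
print(index.search("body", "search"))
```

Keying the postings by field and term together means the same word in a title and in a body is treated as two distinct index entries, which is one way to frame part (i). The weight is computed at query time here; whether weights are stored or computed is itself a point for discussion under part (ii).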

Question 3
(a) How do you load free text into Lucene?
(b) How do you load fielded text?
(c) What format options are there?
(d) How does it handle various character sets, stoplists, stemming, etc.?
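
For Question 3(d), it may help to see the kind of processing involved. Lucene performs this step in components it calls analyzers. The Python sketch below is a hypothetical stand-in, not Lucene's analyzer API: the function name analyze, the six-word stoplist, and the crude suffix stripping (much weaker than the Porter stemmer) are assumptions made for illustration.

```python
import re
import unicodedata

STOP_WORDS = frozenset({"a", "an", "and", "of", "the", "to"})  # illustrative stoplist
SUFFIXES = ("ing", "ed", "s")  # crude suffix stripping, far weaker than Porter


def analyze(text, stop_words=STOP_WORDS):
    """Turn raw text into index terms: normalize, tokenize, filter, stem."""
    # NFKC folds Unicode compatibility forms so equivalent strings compare equal.
    normalized = unicodedata.normalize("NFKC", text).lower()
    terms = []
    for token in re.findall(r"\w+", normalized):
        if token in stop_words:
            continue
        for suffix in SUFFIXES:
            # Strip only when a stem of at least three characters remains.
            if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                token = token[: -len(suffix)]
                break
        terms.append(token)
    return terms


print(analyze("Indexing the files of Zürich"))
```

The same function has to be applied to both documents and queries, otherwise a query term such as "files" would never match the indexed term "file".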

Question 4
How do you incorporate Lucene queries and results into your own user interface?

Question 5
If you wanted to modify Lucene to support a novel search algorithm, how would you go about it?