Submitted By: Usha MIT-876-2K11 M.Tech(3rd Sem) Information Technology


Digital Libraries Submitted By: Usha MIT-876-2K11 M.Tech(3rd Sem) Information Technology

Need for Digital Libraries Nowadays, researchers make their work available online as PostScript or PDF documents. Accessing this growing body of scientific literature requires digital libraries.

What is a Digital Library A digital library is an integrated set of services for capturing, cataloging, storing, searching, protecting, and retrieving information, which provides coherent organization and convenient access to typically large amounts of digital information. Because huge amounts of digital content are becoming available, modern search engine technologies are now being introduced in digital libraries to retrieve relevant content.

Architecture of Digital Library Search System

Modules of System: Crawler; Document Parser; Indexing Module; Database Search & Browsing Sub-Agent; Web Browser Interface

Crawler The main component of the digital library search system is a crawler that traverses the hypertext structure of the web, downloads web pages or harvests the desired papers published in a specific venue (e.g. a conference or a journal), and stores them in a database. It is an agent that automatically locates and acquires research publications.
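The traversal described on this slide can be sketched as a breadth-first walk over links. This is a minimal illustration, not the system's actual crawler: the function names `crawl`, `fetch_links`, and `is_paper`, the `max_pages` limit, and the `example.org` URLs are all assumptions made for the example, and a small in-memory dictionary stands in for the real web so the sketch runs without network access.

```python
from collections import deque

def crawl(seed, fetch_links, is_paper, max_pages=100):
    """Breadth-first traversal from `seed`; returns the paper URLs found."""
    seen = {seed}
    queue = deque([seed])
    papers = []
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        if is_paper(url):
            # A real crawler would download the file and store it in the database here.
            papers.append(url)
            continue
        for link in fetch_links(url):
            if link not in seen:  # avoid revisiting pages and looping on cycles
                seen.add(link)
                queue.append(link)
    return papers

# Hypothetical toy web: page URL -> outgoing links.
TOY_WEB = {
    "http://example.org/conf": [
        "http://example.org/conf/p1.pdf",
        "http://example.org/conf/session2",
    ],
    "http://example.org/conf/session2": [
        "http://example.org/conf/p2.ps",
        "http://example.org/conf",
    ],
}

found = crawl(
    "http://example.org/conf",
    fetch_links=lambda url: TOY_WEB.get(url, []),
    is_paper=lambda url: url.endswith((".pdf", ".ps")),
)
print(found)
```

Passing `fetch_links` and `is_paper` in as functions keeps the traversal logic separate from how pages are downloaded and how papers are recognized.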

Document Parser This module parses documents and creates the database. It extracts semantic features from the downloaded documents and places them into the database as parsed documents.
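As a rough illustration of what parsing a document into features might look like, the sketch below works on plain text that has already been converted from PostScript or PDF. The slide does not say which semantic features are extracted, so the choice of title, body words, and reference entries is an assumption, as are the function name `parse_document` and the sample text.

```python
import re

def parse_document(text):
    """Split plain text into a title, body words, and a list of cited works."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0] if lines else ""
    # Assumption: the reference list starts at a line reading "References".
    ref_start = next(
        (i for i, ln in enumerate(lines) if ln.lower() == "references"),
        len(lines),
    )
    body = " ".join(lines[1:ref_start])
    # Strip leading markers such as "[1] " from each reference entry.
    citations = [re.sub(r"^\[\d+\]\s*", "", ln) for ln in lines[ref_start + 1:]]
    words = re.findall(r"[a-z]+", body.lower())
    return {"title": title, "words": words, "citations": citations}

# Hypothetical sample document.
SAMPLE = """Ranking Papers in Digital Libraries

We study ranking methods for papers.

References
[1] An Autonomous Citation Index
[2] Link Analysis for Ranking
"""

record = parse_document(SAMPLE)
print(record["title"])
print(record["citations"])
```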

Indexing Module The parsed documents are routed to an indexing module that builds the index based on the keywords present in the pages. Various ranking methods are also implemented in this module to present relevant results to users according to their needs.
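A keyword index of this kind is commonly built as an inverted index that maps each term to the documents containing it. The sketch below is one possible form, not the module's actual design: the slide does not name the ranking methods used, so a plain term-frequency score stands in for them, and `build_index`, `search`, and the sample documents are hypothetical.

```python
from collections import Counter, defaultdict

def build_index(docs):
    """Map each term to {doc_id: number of occurrences in that document}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index

def search(index, query):
    """Rank documents by summed term frequency of the query keywords."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken by document id for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ranked]

# Hypothetical parsed documents.
DOCS = {
    "d1": "digital library search engine",
    "d2": "web crawler for digital library digital archive",
    "d3": "citation index",
}

INDEX = build_index(DOCS)
results = search(INDEX, "digital library")
print(results)
```

Swapping in a different ranking method only requires changing how `scores` is computed; the index structure stays the same.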

Database Search & Browsing Sub-Agent It consists of a query processing sub-agent that takes a syntactically valid user query and returns an HTML-formatted response to the user. The query processing sub-agent provides several browsing capabilities that let a user navigate easily through the document database. Although search by keyword is supported, the emphasis is on using the links between “citing” and “cited” documents to find related research papers.
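Navigation along citing and cited links can be illustrated with a small citation graph. The sketch below is an assumption-laden example rather than the sub-agent's real logic: the paper identifiers, the function names `invert` and `related_by_shared_citations`, and the rule that two papers are related when they cite the same work are all chosen for illustration.

```python
from collections import defaultdict

def invert(cites):
    """Turn 'paper -> works it cites' into 'work -> papers citing it'."""
    cited_by = defaultdict(set)
    for paper, refs in cites.items():
        for ref in refs:
            cited_by[ref].add(paper)
    return cited_by

def related_by_shared_citations(paper, cites):
    """Papers citing at least one work that `paper` also cites, most overlap first."""
    mine = cites.get(paper, set())
    overlap = {
        other: len(mine & refs)
        for other, refs in cites.items()
        if other != paper
    }
    related = [other for other, n in overlap.items() if n > 0]
    return sorted(related, key=lambda other: (-overlap[other], other))

# Hypothetical citation graph: each paper maps to the set of works it cites.
CITES = {
    "A": {"X", "Y"},
    "B": {"Y", "Z"},
    "C": {"Q"},
}

CITED_BY = invert(CITES)
print(sorted(CITED_BY["Y"]))
print(related_by_shared_citations("A", CITES))
```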

Web Browser Interface It is the interface between the user and the main system. The user submits a query as keywords through the web browser interface of the digital library search engine, and the results are displayed to the user on the same interface.

Advantages of Digital Libraries Digital libraries improve on the manual search process in three ways: (1) They automate the tedious, repetitive, and slow process of finding and retrieving Web-based publications. (2) Once potentially relevant papers are retrieved, they guide the user towards interesting papers by making them searchable. (3) When a relevant paper is found, they help the user by suggesting other related papers using similarity measures derived from semantic features of the retrieved documents.
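The slide does not say which similarity measures are used to suggest related papers. As one common possibility, the sketch below computes cosine similarity over word counts; the function name `cosine_similarity` and the sample texts are assumptions made for illustration.

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical paper abstracts.
same = cosine_similarity("citation indexing for papers", "citation indexing for papers")
partial = cosine_similarity("citation indexing", "citation ranking")
none = cosine_similarity("citation indexing", "web crawler")
print(same, partial, none)
```

A score near 1 means the two texts use the same words in similar proportions, and a score of 0 means they share no words, so papers can be suggested in descending order of score.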

CONCLUSION In this presentation, I described an agent that automates and enhances the task of finding interesting and relevant research publications on the World Wide Web. It can save researchers a great deal of time and effort during a literature search.

REFERENCES Sumita Gupta, Neelam Duhan, Poonam Bansal. A Comparative Study of Page Ranking Algorithms for Online Digital Libraries. Kurt D. Bollacker, Steve Lawrence, C. Lee Giles. CiteSeer: An Autonomous Web Agent for Automatic Retrieval and Identification of Interesting Publications.

THANK YOU

QUERIES???