1
Vitalseek
Dr. Susan Gauch
The University of Kansas
2
Overview
Provide technical and research capabilities for a Kansas City startup company
Partner with Today Communications, Inc. to provide high-quality online medical information
Develop innovative, quality-based rankings of online Web pages
Transition technology so that it is easy to adapt for potential clients and easy for the sponsoring company to maintain
3
Project Goals
System to support online entry of human judgments for a wide variety of medical Web sites on a large number of criteria
Novel search engine combining traditional keyword-based retrieval with user-selected quality criteria
Speed
Scalability
Reliability
4
Ranking System
- Online entry, viewing, validation, modification
- Over 150 criteria per site
- Sites rated
    - Overall
    - Per topic (50 topics)
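One way to picture the ratings record described on this slide is sketched below. It is only an illustration: the class name `SiteRating`, the 0 to 5 score range, and the rule that a per-topic score falls back to the overall score are all assumptions, not details taken from the Vitalseek system.

```python
from dataclasses import dataclass, field


@dataclass
class SiteRating:
    """Hypothetical record of human judgments for one Web site."""
    url: str
    overall: dict = field(default_factory=dict)    # criterion -> score
    per_topic: dict = field(default_factory=dict)  # topic -> {criterion -> score}

    def score(self, criterion, topic=None):
        """Return the per-topic score if one was entered, else the overall score."""
        if topic is not None and criterion in self.per_topic.get(topic, {}):
            return self.per_topic[topic][criterion]
        return self.overall.get(criterion)


def validate(rating, known_criteria, known_topics, lo=0, hi=5):
    """Collect validation errors; the lo..hi score range is an assumed scale."""
    errors = []
    tables = [("overall", rating.overall)]
    for topic, scores in rating.per_topic.items():
        if topic not in known_topics:
            errors.append(f"unknown topic: {topic}")
        tables.append((topic, scores))
    for label, scores in tables:
        for criterion, value in scores.items():
            if criterion not in known_criteria:
                errors.append(f"{label}: unknown criterion {criterion}")
            elif not lo <= value <= hi:
                errors.append(f"{label}: {criterion} out of range")
    return errors
```

Validation returns a list of messages rather than raising, so an online entry form could show every problem at once.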
5
Spider
Automatically collect Web pages from Web sites
  – Keys off of sites as they are entered in the ratings database
Continuous loop
  – Visit sites
  – Index content
  – Revisit sites
Multiple, concurrent spiders on a dedicated machine
  – Coordination
  – Speed
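The visit, index, revisit loop with several concurrent spiders might look roughly like the sketch below. The function names and the `get_sites`, `fetch`, and `index` callables are hypothetical stand-ins (for reading the ratings database, downloading a site, and indexing its pages), and a thread pool is just one possible way to coordinate the spiders.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl_pass(sites, fetch, index, workers=4):
    """Visit every site once using concurrent workers, then index each result.

    fetch(site) -> pages is run in worker threads; index(site, pages) is
    called in the calling thread, so the index needs no locking.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input sites.
        for site, pages in zip(sites, pool.map(fetch, sites)):
            index(site, pages)


def run_spider(get_sites, fetch, index, passes):
    """Repeat the loop; get_sites() is re-read so newly rated sites are picked up."""
    for _ in range(passes):
        crawl_pass(list(get_sites()), fetch, index)
```

A production spider would loop indefinitely; `passes` is bounded here only so the sketch terminates.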
6
Indexing Documents
Initially, all documents are indexed together
  – Time bottleneck (4+ days to index)
  – Space bottleneck (resulting file exceeds system limits)
Revised version
  – Each site indexed separately
      Can visit and index in a loop, site by site
  – But must select a subset of the collections to process for each query
      Classic distributed information retrieval problem
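As a rough illustration of the revised design, the sketch below builds one small inverted index per site instead of a single global one. The whitespace tokenizer, the function names, and the in-memory dictionary layout are simplifying assumptions; the slides do not describe the actual index format.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Very simple tokenizer (assumed): lowercase and split on whitespace."""
    return text.lower().split()


def build_site_index(pages):
    """Build an inverted index for one site.

    pages: dict mapping url -> page text.
    Returns: dict mapping term -> {url: term frequency}.
    """
    index = defaultdict(dict)
    for url, text in pages.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][url] = tf
    return dict(index)


def build_all(sites):
    """Index each site separately; sites maps site name -> pages dict."""
    return {site: build_site_index(pages) for site, pages in sites.items()}
```

Because each site has its own index, a single site can be re-indexed after a revisit without touching the others.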
7
Retrieval System - Broker
Given a query and a set of criteria
Phase I – Broker
  – Select those Web sites that meet the criteria
      E.g., Privacy, Authority, Navigation
  – Select those sites that have the best content from among the first set
      Number of documents with the query words
  – Send the query to the top N sites (approx. 10)
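The broker phase can be sketched as a filter followed by a content-based ranking of the surviving sites. In this sketch `select_sites` and its parameters are hypothetical names, each per-site index is assumed to map term -> {url: frequency}, and "documents with the query words" is read as documents containing at least one query term, which is one possible interpretation.

```python
def select_sites(query_terms, required, site_ratings, site_indexes, top_n=10):
    """Pick the sites that should receive the query.

    required:     set of criteria the user selected (e.g. {"privacy"}).
    site_ratings: dict site -> set of criteria that the site satisfies.
    site_indexes: dict site -> {term: {url: term frequency}}.
    """
    # Step 1: keep only sites that meet every selected criterion.
    eligible = [s for s, passed in site_ratings.items() if required <= passed]

    # Step 2: score each eligible site by how many of its documents
    # contain at least one query term (an assumed reading of the slide).
    def doc_count(site):
        index = site_indexes.get(site, {})
        docs = set()
        for term in query_terms:
            docs.update(index.get(term, {}))
        return len(docs)

    scored = [(doc_count(s), s) for s in eligible]
    scored = [(count, s) for count, s in scored if count > 0]
    # Highest count first; ties broken by site name for determinism.
    scored.sort(key=lambda cs: (-cs[0], cs[1]))

    # Step 3: forward the query to the top N sites only.
    return [s for _, s in scored[:top_n]]
```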
8
Retrieval System – Query Processing
For each site,
  – Identify the top documents for the query
      Page weight with respect to query terms
      Site weight with respect to user criteria
      Combine these factors and rank the pages
Fuse results from all sites
  – Merge the lists of pages based on weights
  – Rearrange as necessary to provide results from a mix of sites on each page
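A minimal sketch of the per-site ranking and the fusion step follows. The slides do not say how page and site weights are combined or how results are rearranged, so two assumptions are made here: the weights are multiplied, and a mix of sites is enforced by capping how many results one site may contribute to each results page (`max_per_site`, a hypothetical parameter).

```python
def rank_site(site, site_weight, page_weights, top_k=10):
    """Score the pages of one site; page_weights maps url -> page weight.

    The combined score is page weight * site weight (an assumed rule).
    """
    scored = [(w * site_weight, site, url) for url, w in page_weights.items()]
    scored.sort(key=lambda r: (-r[0], r[2]))
    return scored[:top_k]


def fuse(results, page_size=10, max_per_site=3):
    """Merge (score, site, url) tuples from all sites into result pages.

    Results are merged by score; a result that would exceed the per-site
    cap on the current page is deferred to a later page.
    """
    pending = sorted(results, key=lambda r: (-r[0], r[1], r[2]))
    pages = []
    while pending:
        page, deferred, per_site = [], [], {}
        for result in pending:
            site = result[1]
            if len(page) < page_size and per_site.get(site, 0) < max_per_site:
                page.append(result)
                per_site[site] = per_site.get(site, 0) + 1
            else:
                deferred.append(result)
        pages.append(page)
        pending = deferred
    return pages
```

Deferring rather than dropping capped results keeps every retrieved page reachable; it only moves lower-ranked pages from a dominant site further down.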
9
Partner System
Allows Vitalseek to serve as a back-end search engine
Results appear as though from partner
Web-based system for
  – Entering partners
  – Customizing results
  – Customizing search criteria
10
Challenges
Combining user criteria and keywords
  – Initial versions used a weighted combination
  – Abandoned in favor of filtering version
Scalability
  – Thousands of sites
  – Millions of pages
Spidering and indexing speed
System limits
  – Priority-based pruning of index files
High-tech start-up demands, university research lab schedule
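The difference between the two ways of combining user criteria with keywords can be illustrated with a toy example. The scores, the `alpha` mixing weight, and the `threshold` below are invented for illustration; the slides do not give the actual formulas.

```python
def weighted_rank(pages, alpha=0.5):
    """Earlier approach (sketch): blend keyword and criteria scores.

    pages: list of (url, keyword_score, criteria_score), scores in [0, 1].
    """
    return sorted(pages, key=lambda p: -(alpha * p[1] + (1 - alpha) * p[2]))


def filtered_rank(pages, threshold=0.5):
    """Filtering approach (sketch): drop pages that fail the criteria,
    then rank the survivors by keyword score alone."""
    kept = [p for p in pages if p[2] >= threshold]
    return sorted(kept, key=lambda p: -p[1])


pages = [("a", 0.9, 0.1), ("b", 0.6, 0.8), ("c", 0.3, 0.9)]
blended = [p[0] for p in weighted_rank(pages)]
filtered = [p[0] for p in filtered_rank(pages)]
```

With the blended score, page "a" still appears in the results despite a poor criteria score; with filtering it is excluded outright, which matches the idea that user-selected criteria act as hard requirements.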
11
Vitalseek.com
12
Viewpoint Filters
13
Type of Site Filters
14
Site Filters
15
Content Filters
16
Topic Filters
17
Resource Filters
18
Query: kidney
19
Query: kidney in Diabetes
20
URAC accredited sites only