1
Searching uPortal with a Third-Party Search Engine
Katya Sadovsky
University of California, Irvine
Administrative Computing Services
katya@uci.edu
2
Agenda
§ Our goals
§ Our current setup
§ Built-in vs. Third-Party Search Engine
§ Dynamic vs. Static Content
§ Issues in combining uPortal with a search engine
§ Demonstration
§ Questions & Answers
3
Our goals
§ Use the portal as a “gateway” to information
§ Allow users to search for pertinent portal content
§ Present users with integrated search results (portal and non-portal content)
§ Aid the search engine in weighing the results (meaningful page title, metadata, etc.)
4
Our current setup
§ uPortal 2.0.3
§ Verity Ultraseek Search Engine (formerly Inktomi)
§ Tomcat 4.0.6
5
Built-in vs. Third-Party Search Engine
§ Pros of using a built-in search engine:
  - Ensures generation of correct links to content
  - Presents users with customized (user-specific) result sets
  - Can fully utilize channel metadata
  - Employs the portal’s authorization infrastructure
6
Built-in vs. Third-Party Search Engine
§ Pros of using a third-party search engine:
  - Well-tested, mature functionality
  - Well-developed dictionary and thesaurus
  - Ability to search content beyond uPortal and present users with integrated search results
  - URL filtering capabilities
  - Useful but optional: a good administrative GUI, quick link definitions
7
Dynamic vs. Static Content
§ uPortal generates dynamic content that depends on the user's preferences, security level, browser, and operating system
§ Most search engines are designed to work with static content:
  - They index content periodically and use a cached/stored index to present search results
  - Search results are not user-specific
  - Only public content is indexed
8
Issues/Areas of difficulty
§ User Agent setting
§ Filtering out certain URLs
§ Deciding what to search:
  - Search index/start page
  - Searchable vs. non-searchable content
§ Generating links to channels using:
  - global (published) vs. instance (subscribed) ID
  - functional names
§ Page title used in search results
9
User Agent
§ Issues:
  - uPortal needs to know the mapping between a user agent and a MIME type/output type
  - When the user agent is not recognized, uPortal displays a screen asking the user to choose a profile
§ Solutions:
  - If you know the user agent reported by the search engine, add a mapping to the UP_USER_UA_MAP table
  - Choose a search engine that allows you to specify a user agent
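The fallback behavior described on this slide can be illustrated with a minimal sketch. It is not uPortal's code: the substring matching rule, the profile names, and the `UA_PROFILE_MAP` entries are all assumptions made for illustration, standing in for whatever rows the UP_USER_UA_MAP table actually holds.

```python
from typing import Optional

# Hypothetical stand-in for the rows of UP_USER_UA_MAP:
# (user-agent fragment, profile name). All entries are invented.
UA_PROFILE_MAP = [
    ("Ultraseek", "html-browser"),
    ("Mozilla", "html-browser"),
]


def resolve_profile(user_agent: str) -> Optional[str]:
    """Return the profile of the first fragment found in the user agent."""
    lowered = user_agent.lower()
    for fragment, profile in UA_PROFILE_MAP:
        if fragment.lower() in lowered:
            return profile
    return None


def handle_request(user_agent: str) -> str:
    """Render with a known profile, or fall back to the selection screen."""
    profile = resolve_profile(user_agent)
    if profile is None:
        # An unrecognized crawler would land here and index the
        # profile-selection screen instead of real content.
        return "profile-selection-screen"
    return f"render:{profile}"
```

Without an entry for the crawler's user agent, every page the search engine fetches would be the selection screen, which is why either the mapping or a configurable user agent is needed.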
10
Example: setting a search engine user agent
11
Filtering out certain URLs
§ Issues:
  - A search engine may follow a link that includes a channel option or command
  - uPortal URL tags:
    - Are dynamically generated for each URL hit
    - Any tag other than 'idempotent' makes the search result link meaningless
    - While indexing content, a search engine may enter a loop, referencing the same page with different tags
12
Filtering out certain URLs (cont’d)
§ Solutions:
  - Acquire a search engine that allows URL filtering and filter out all “offending” URLs
  - If available with the search engine, use advanced URL “de-duping”
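As a rough illustration of the two solutions, the sketch below filters crawl candidates and computes a de-duping key. The URL shape (`tag.<TAG>.render.userLayoutRootNode.uP`), the list of command parameters, and the host name are assumptions made for the example; a real search engine would express these rules in its own filter syntax.

```python
import re
from urllib.parse import urlsplit

# Assumed URL shape: .../tag.<TAG>.render.userLayoutRootNode.uP?...
TAG_PATTERN = re.compile(r"/tag\.([^./]+)\.")

# Assumed names of query parameters that carry channel commands or options.
COMMAND_PARAMS = ("uP_remove_target", "uP_help_target", "uP_edit_target")


def should_index(url: str) -> bool:
    """Accept only URLs with the 'idempotent' tag and no channel command."""
    parts = urlsplit(url)
    match = TAG_PATTERN.search(parts.path)
    if match and match.group(1) != "idempotent":
        return False  # session-specific tag: the link is useless to other users
    if any(param + "=" in parts.query for param in COMMAND_PARAMS):
        return False  # following this link would trigger a channel command
    return True


def dedupe_key(url: str) -> str:
    """Collapse the tag so URLs that differ only by tag compare equal."""
    return TAG_PATTERN.sub("/tag.*.", url)
```

Filtering stops the crawler from entering the tag loop in the first place; the de-duping key is a second line of defense for pages that were already fetched under different tags.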
13
Example: Filtering out certain URLs
14
Example: using URL filters
15
What to search: index/start page
§ Issues:
  - A user layout is a poor starting point for a search engine crawl: a typical layout doesn't contain all the channels
  - The crawler needs a page with 'idempotent' links to all the searchable channels
§ Solution:
  - A Searchable Channel Index channel
16
What to search: searchable vs. non-searchable content
§ Issue:
  - Not all channels need to be included in the search
§ Solution:
  - We added a 'searchable' attribute to all the channels
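The idea behind a searchable channel index can be sketched as follows: walk the channel definitions, keep those flagged as searchable, and emit one 'idempotent' link per channel for the crawler to follow. This is an illustration only, not the CSearchRegistry implementation; the dictionary fields, the `uP_fname` link format, and the example channels are assumptions.

```python
from html import escape
from urllib.parse import quote


def build_search_index(channels, base_url):
    """Return an HTML list with one idempotent link per searchable channel."""
    items = []
    for channel in channels:
        if not channel.get("searchable"):
            continue  # non-searchable channels never reach the crawler
        # Assumed link format: idempotent tag plus the channel's functional name.
        href = (
            f"{base_url}/tag.idempotent.render.userLayoutRootNode.uP"
            f"?uP_fname={quote(channel['fname'])}"
        )
        title = escape(channel["title"])
        items.append(f'<li><a href="{escape(href)}">{title}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Hypothetical channel definitions used only to exercise the sketch.
example_channels = [
    {"fname": "campus-news", "title": "Campus News", "searchable": True},
    {"fname": "payroll", "title": "Payroll", "searchable": False},
]
index_html = build_search_index(example_channels, "http://portal.example.edu/uPortal")
```

Pointing the search engine's start page at such an index gives it a stable, user-independent entry point to every channel that should appear in results.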
17
CSearchRegistry channel
18
CSearchRegistry: stylesheet
19
Generating links to channels
§ Problem: channel instance (subscribed) IDs vary from user to user, so the search result links are inconsistent
§ Solutions: link to channels using
  - global (published) IDs (involves code changes)
  - functional names (fname): new functionality, available in CVS (Concurrent Versions System)
20
Linking to channels via their published IDs: implementation plan
§ Modify org/jasig/portal/UserInstance.java to recognize that the user is asking for a published channel that may not be in the user’s layout
§ Create a temporary hidden folder in the user’s layout to store “temporary” channels (make sure to delete this folder before the layout is saved to the database)
§ Add XML channel definitions to this hidden folder
§ Proceed to render as usual
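The plan above can be sketched on a toy XML layout. This is not the UserInstance.java change itself: the element and attribute names (`layout`, `folder`, `channel`, `ID`, `chanID`, `hidden`) and the folder ID are assumptions chosen to show the add-then-strip lifecycle of the temporary folder.

```python
import xml.etree.ElementTree as ET

TEMP_FOLDER_ID = "temp-published-channels"  # hypothetical folder ID


def ensure_temp_folder(layout):
    """Find or create the hidden folder that holds temporary channels."""
    folder = layout.find(f"folder[@ID='{TEMP_FOLDER_ID}']")
    if folder is None:
        folder = ET.SubElement(
            layout, "folder", {"ID": TEMP_FOLDER_ID, "hidden": "true"}
        )
    return folder


def add_published_channel(layout, channel_xml):
    """Attach a published channel definition so it can be rendered as usual."""
    ensure_temp_folder(layout).append(ET.fromstring(channel_xml))


def strip_temp_folder(layout):
    """Remove the temporary folder; call this before the layout is saved."""
    for folder in layout.findall(f"folder[@ID='{TEMP_FOLDER_ID}']"):
        layout.remove(folder)


layout = ET.fromstring(
    "<layout><folder ID='f1'><channel ID='n10' chanID='5'/></folder></layout>"
)
add_published_channel(layout, "<channel ID='tmp1' chanID='42'/>")
# ... render the requested channel here ...
strip_temp_folder(layout)
```

Keeping the temporary channels in their own folder means cleanup is a single removal, so the user's saved layout never picks up channels they did not subscribe to.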
21
Page titles used in search results
§ Issues:
  - Out of the box, uPortal has a statically set page title (no matter what channel is viewed)
  - Search engines generally use page titles (or other metadata) for:
    - search result titles
    - result ranking
    - de-duping
  - Users have to be trained to enter meaningful page titles when creating documents/channels (e.g. do not start each page title with UCIrvine)
22
Page titles used in search results
§ Solution: when channels are rendered in 'focused' or 'detached' mode, add the channel title to the default page title (following is a fragment of webpages/stylesheets/org/jasig/portal/layout/tab-column/nested-tables/nested-tables.xsl):
§ ............
23
Example: page titles
24
Conclusions
§ There are tradeoffs when using either a built-in or a third-party search engine
§ We have yet to address the following issues:
  - searching restricted content
  - creating META data tags to help the search engine with content ranking
§ Overall, our portal project could not succeed without a search function
25
Links
§ UC Irvine’s uPortal installation (SNAP): http://snap.uci.edu
§ This presentation: http://snap.uci.edu/PortalDocs/uPortal_Search.ppt
26
Demo
27
Questions?