Search Can Be Your Best Friend: You Just Need to Know How to Talk to It
IW 306
Ágnes Molnár
About me
Ágnes Molnár, MOSS MVP, MCSD, MCT
Senior Consultant, R&D Director
L&M Solutions, Budapest, Hungary
WHY?
- Information overload
- Findability
- Gartner: 8 hours / week / information worker
- IDC: 9.5 hours / week / information worker
- Searching without finding: 3.5 hours / week / information worker. COMPLETELY WASTED!
Business Requirements What? Where? How?
Business Requirements
Data
- Description: Properties that represent objects, events, elements, etc.
- Supporting SharePoint technology: Lists hold data and metadata; libraries hold documents (with data and information)
Information
- Description: Data that can be used
- Supporting SharePoint technology: Lists and libraries
Knowledge
- Description: Information put to action and/or integrated with other information. Provides a basis for decisions and planning actions.
- Supporting SharePoint technology: Documents; social networking
Plan the Searching
Physical Architecture: Server Roles
Index Server
- Crawling
- Indexing
- Content Sources
- Protocol handlers: HTTP, FTP, File, BDC, Lotus Notes, Custom
- Word Breaker
- Noise Word Removal
- iFilters
- Content Database
- Full Text Index
- Metadata and Permissions
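To make the indexing steps on this slide concrete, here is a minimal Python sketch of word breaking, noise word removal, and building a full text index. It is an illustration only, not how the SharePoint index server is implemented; the noise-word list, the function names, and the sample documents are all assumptions.

```python
import re
from collections import defaultdict

# Hypothetical noise-word list; a real index server uses per-language lists.
NOISE_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}

def break_words(text):
    """Word breaker: split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def remove_noise(tokens):
    """Noise word removal: drop tokens that carry no search value."""
    return [t for t in tokens if t not in NOISE_WORDS]

def build_index(documents):
    """Full text index: map each remaining term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in remove_noise(break_words(text)):
            index[term].add(doc_id)
    return index

# Made-up sample content standing in for crawled documents.
docs = {
    "doc1": "The budget of the marketing team",
    "doc2": "Marketing plan and budget review",
}
index = build_index(docs)
```

Looking up `index["budget"]` returns both document ids, while a noise word such as "the" never enters the index.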
Physical Architecture: Server Roles
Query Server
- Accept search queries from users
- Build return set
- Return results
[Diagram: Database Server, Index Server, Query Servers, Web Front-End Servers]
Physical Architecture: Scaling out
More Index Servers
- More than 50 (10) million documents
- Crawling takes too long
- Second SSP needed
More Query Servers
- Need to include content that cannot be crawled
- Query demand is rising
Content Sources
- SharePoint sites. When NOT to crawl: if you want to EXCLUDE their content
- File Share. When to crawl: document search requirements
- Business Data. When to crawl: external data search requirements; integrated solutions
- Website
- Exchange Public Folder. When to crawl: important business information
Search Scopes
Refine the queries
Scope Rules
- Web address
- Property query
- Content source
- All content
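The four scope rule types can be pictured as predicates over indexed items. The sketch below is a simplified model, assuming each rule simply includes matching items; the `Item` class, the rule helpers, and the URLs are hypothetical and are not SharePoint APIs.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    """Hypothetical stand-in for an indexed item."""
    url: str
    content_source: str
    properties: dict = field(default_factory=dict)

def web_address_rule(prefix):
    """Match items whose URL starts with the given address."""
    return lambda item: item.url.startswith(prefix)

def property_rule(name, value):
    """Match items whose managed property equals the given value."""
    return lambda item: item.properties.get(name) == value

def content_source_rule(source):
    """Match items crawled from the named content source."""
    return lambda item: item.content_source == source

def all_content_rule():
    """Match everything in the index."""
    return lambda item: True

def in_scope(item, include_rules):
    """An item belongs to the scope if any include rule matches it."""
    return any(rule(item) for rule in include_rules)

policy = Item("http://intranet.example/hr/policy.docx", "Intranet", {"ContentType": "Policy"})
budget = Item("file://fileserver/share/budget.xlsx", "File Share")
hr_scope = [web_address_rule("http://intranet.example/hr/")]
```

With these definitions, `in_scope(policy, hr_scope)` is true and `in_scope(budget, hr_scope)` is false, which is how a scope narrows a query to one part of the index.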
Keywords and Best Bets
- Keyword: marks specific items as more relevant, so that they show up more prominently in the search results
- Best Bet: a relevant item that you choose for a subject
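One way to think about keywords and Best Bets is as a lookup table consulted before the ranked results are shown. The sketch below assumes a hand-maintained table; the keyword, the synonyms, and the URL are invented for illustration.

```python
# Hypothetical keyword table: each keyword maps to the Best Bets chosen for it.
BEST_BETS = {
    "vacation": [("Vacation request form", "http://intranet.example/hr/vacation")],
}

# Hypothetical synonyms that should trigger the same keyword.
SYNONYMS = {"holiday": "vacation", "leave": "vacation"}

def best_bets_for(query):
    """Return the Best Bets to show above the regular results for a query."""
    hits = []
    for word in query.lower().split():
        keyword = SYNONYMS.get(word, word)
        for bet in BEST_BETS.get(keyword, []):
            if bet not in hits:  # do not show the same Best Bet twice
                hits.append(bet)
    return hits
```

A query such as "holiday request" would surface the vacation form first, while a query with no matching keyword returns an empty list and only the regular results appear.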
Authoritative Pages Authoritative Page: A page that is considered a better match for any given search term
KEYWORDS AND BEST BETS
Federated Search Advantages
- Conserve resources by not crawling and indexing
- Can include content that cannot be crawled
- Latest information from different content sources
Federated Search Disadvantages
- Unable to configure ranking within the result set
- Unable to control which results appear in the result set
- Cannot scope the results
- Cannot combine the results into a single result set
- The more search web parts on the same page, the more time to load
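A federated location returns its own finished result list, typically as an RSS or Atom feed, which is why the local server cannot re-rank or merge it. The sketch below parses such a feed with Python's standard library; the sample feed and the `parse_federated_results` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up RSS response, standing in for what a remote search engine might return.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Results for sharepoint</title>
    <item>
      <title>First result</title>
      <link>http://remote.example/first</link>
      <description>Summary of the first result</description>
    </item>
    <item>
      <title>Second result</title>
      <link>http://remote.example/second</link>
      <description>Summary of the second result</description>
    </item>
  </channel>
</rss>"""

def parse_federated_results(rss_text):
    """Return the remote results in the order the remote server ranked them."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]

results = parse_federated_results(SAMPLE_RSS)
```

The order of `results` is whatever the remote server decided, so each federated location ends up as its own separate list on the page.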
FEDERATION
Federate or not?
YES
- remote site’s robots.txt blocks SharePoint’s crawler
- you need results only with specific keywords and/or keyword patterns in the query
- content changes very often, immediate crawling needed
- queries under different security context
- infrequently queried content
- >500 content sources
NO
- you don’t have enough bandwidth
- content changes very often, but immediate crawling NOT needed
- content that is not indexed by the remote server
- remote server does not return results as RSS or Atom
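The first item in the YES list can be checked before a content source is set up. The sketch below uses Python's standard `urllib.robotparser` on a made-up robots.txt; the user agent name `ExampleCrawler` and the URLs are placeholders, not the real SharePoint crawler's values.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content; in practice it would be fetched from the remote site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def crawler_allowed(robots_text, user_agent, url):
    """Return True if the given robots.txt lets this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)

# "ExampleCrawler" is a hypothetical user agent name used only for this sketch.
blocked = not crawler_allowed(ROBOTS_TXT, "ExampleCrawler", "http://remote.example/private/report.html")
allowed = crawler_allowed(ROBOTS_TXT, "ExampleCrawler", "http://remote.example/public/index.html")
```

If the content you need sits under a path the crawler may not fetch, federation is the remaining way to get it into the results page.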
Findability Best Practices
- Use Scopes
- Use Master Site Directory
- Train your users
- URLs and Managed Paths
- Content Types – Describing and Tagging
- My Sites and User Profiles
- Blog, Wiki
- Collaboration
- Knowledge Sharing
User Interface
UI scenarios:
- SharePoint
- Browser integration
- Custom application
Separate several search results
Easy to search – easy to use
Use RSS / alerts
USER EXPERIENCE
The Magic Word: SEO (Search Engine Optimization)
The process of optimizing sites and pages for search engines to achieve better relevance and ranking for the site.
SEO Best Practices – DO
- Use keywords
- Place your content as high up in the page as possible to give it more relevance
- Use a clear site hierarchy – every page has to be reachable
- Check for broken links
- Use a text browser (e.g. Lynx) to examine your site
- Test your site in different browsers
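Two of these checks, reachability and broken links, can be automated once you have a map of which page links to which. The sketch below runs a breadth-first walk from the home page over a hypothetical link map; the page names and the `audit_site` helper are invented for illustration.

```python
from collections import deque

def audit_site(links, home):
    """Find pages not reachable from home, and links that point to missing pages.

    links maps each existing page to the list of pages it links to.
    """
    reachable = {home}
    queue = deque([home])
    broken = set()
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in links:
                broken.add((page, target))  # link to a page that does not exist
            elif target not in reachable:
                reachable.add(target)
                queue.append(target)
    unreachable = set(links) - reachable
    return unreachable, broken

# Made-up site map: "orphan" exists but nothing links to it.
site = {
    "home": ["products", "about"],
    "products": ["home", "old-page"],
    "about": [],
    "orphan": ["home"],
}
unreachable, broken = audit_site(site, "home")
```

Here `unreachable` contains the orphan page and `broken` contains the link from the products page to a page that no longer exists.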
SEO Best Practices – DO (continued)
- Use proper semantic code:
  - Meta tags (title, description)
  - Headlines (<h1>, <h2>, ...)
  - List items (<ul>, <ol>, <li>)
  - Images
- Use descriptive text in your hyperlinks
- Use descriptive page titles
- Build a site map
- Use valid HTML and XML
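Several of these points can be checked mechanically. The sketch below uses Python's standard `html.parser` to look for a page title, a meta description, a top-level headline, and images without alternative text. The set of checks and the `alt` attribute rule are assumptions chosen for illustration, not a complete SEO audit.

```python
from html.parser import HTMLParser

class SeoAudit(HTMLParser):
    """Collect the few facts about a page that the checks below need."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.h1_count = 0
        self.images_without_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and not attrs.get("alt"):
            self.images_without_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def audit(html):
    """Return a list of human-readable problems found in the page."""
    parser = SeoAudit()
    parser.feed(html)
    parser.close()
    problems = []
    if not parser.title.strip():
        problems.append("missing page title")
    if not parser.description:
        problems.append("missing meta description")
    if parser.h1_count == 0:
        problems.append("no <h1> headline")
    if parser.images_without_alt:
        problems.append(f"{parser.images_without_alt} image(s) without alt text")
    return problems
```

Calling `audit(page_html)` on each page of a site gives a quick list of pages that need attention before a crawler sees them.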
SEO Best Practices – DO NOT
- Don’t name all pages with the same page title
- Don’t load your pages with irrelevant keywords
- Don’t use complex URLs
- Don’t use temporary redirects
- Don’t use complex pages
- Avoid web spammers
SEO Best Practices
MAKE YOUR PAGES PRIMARILY FOR USERS, NOT FOR SEARCH ENGINES!!!
Thank you for attending! Please be sure to fill out your session evaluation!