Published by Ella Lindsey. Modified over 9 years ago.
1
Enterprise Search
2
Search Architecture Configuring Crawl Processes Advanced Crawl Administration Configuring Query Processes Implementing People Search Administering Farm-Level Settings In This Session …
3
Indexing and Search Architecture Server Roles Indexing Processes Protocol Handlers iFilters Word Breakers and Stemmers 32-bit and 64-Bit Index Propagation Query Processes Search Architecture
4
Indexing & Architecture
5
Server Roles Query Server Role Indexer Role Web Server Role Database Server Role
6
Indexing Process
… Retrieves start address from content source
… invokes a protocol handler to traverse
Handler identifies content nodes (files, pages)
Handler retrieves system-level metadata and access control lists
Handler invokes the iFilter
7
Indexing Process
iFilter retrieves content and metadata
Content and metadata are parsed and added to the full-text index
Metadata and access control lists are added to the search database
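The two Indexing Process slides can be read as a small pipeline. The sketch below is only an illustration of that flow, not SharePoint code: `ContentNode`, `toy_ifilter`, and `ToyIndexer` are hypothetical names, and the "iFilter" here merely strips a made-up `<b>` tag.

```python
from dataclasses import dataclass, field


@dataclass
class ContentNode:
    """A crawled item (file or page) as a protocol handler might hand it over."""
    url: str
    raw: str                         # content in its native format
    acl: list[str]                   # access control list from the source system
    system_metadata: dict[str, str]  # system-level metadata


def toy_ifilter(node: ContentNode) -> tuple[str, dict[str, str]]:
    """Hypothetical iFilter stand-in: strip formatting, return text and properties."""
    text = node.raw.replace("<b>", "").replace("</b>", "")
    return text, {"char_count": str(len(text))}


@dataclass
class ToyIndexer:
    full_text_index: dict[str, set[str]] = field(default_factory=dict)
    search_database: dict[str, dict] = field(default_factory=dict)

    def index(self, node: ContentNode) -> None:
        text, properties = toy_ifilter(node)
        # Parsed content is added to the full-text index (word -> URLs).
        for word in text.lower().split():
            self.full_text_index.setdefault(word, set()).add(node.url)
        # Metadata and access control lists are added to the search database.
        self.search_database[node.url] = {
            "metadata": {**node.system_metadata, **properties},
            "acl": list(node.acl),
        }
```

Keeping the ACL next to the metadata is what later allows results to be security-trimmed at query time.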
8
Protocol Handlers Connect to and traverse content sources over a given protocol. Identify content, invoke iFilters, retrieve system-level metadata, and return content and metadata streams to the index engine
9
Protocol Handler Characteristics Web Protocol Handler SharePoint Protocol Handler File Protocol Handler Exchange Public Folder Protocol Handler Business Data Catalog Protocol Handler Lotus Notes Protocol Handler
10
iFilters Open content nodes in their native format. Filter out embedded formatting and retrieve content and properties. iFilters included in MOSS 2007 Additional iFilters
11
Word Breakers in the Indexing Process Word Breakers at Query Time Stemmers Stemmers in the Indexing Process Stemmers at Query Time Enabling Language-Specific Stemmers Word Breakers & Stemmers
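The slide lists only topics, so the following is a generic toy illustration of what a word breaker and a stemmer do, not a description of the language-specific components shipped with SharePoint. The suffix list and the function names are assumptions made for the example.

```python
import re


def break_words(text: str) -> list[str]:
    """Toy word breaker: split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(word: str) -> str:
    """Toy English stemmer: strip one common suffix, keeping at least 3 letters."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(text: str) -> set[str]:
    """Terms as they would be stored or looked up after breaking and stemming."""
    return {stem(word) for word in break_words(text)}


def matches(query: str, document: str) -> bool:
    """True if every stemmed query term also occurs in the document."""
    return index_terms(query) <= index_terms(document)
```

Applying the same breaker and stemmer to both the document and the query is what lets a query for "crawls" find a document containing "crawled".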
12
Query Servers and Web Servers Index Servers – Availability of Protocol Handlers – Availability of iFilters 32-Bit & 64-Bit
13
Master and Shadow Indexes Continuous Propagation from Index Server to Query Servers Between 3 and 30 seconds for an indexed document to be searchable Index Management
14
Query terms are collected by a Web server. Query terms are supplemented with contextual information. The Web server initiates the query. The results are security-trimmed. Query Processes
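A minimal sketch of the query steps, assuming a word-to-URL index and per-URL access control lists held as plain dictionaries. The function name `run_query` and the use of group names as the "contextual information" are assumptions for illustration.

```python
def run_query(terms, index, acls, user_groups):
    """Return the URLs that contain every term and that the user may see.

    index:       dict mapping a word to the set of URLs containing it
    acls:        dict mapping a URL to the set of groups allowed to read it
    user_groups: set of groups the querying user belongs to (query context)
    """
    postings = [index.get(term.lower(), set()) for term in terms]
    hits = set.intersection(*postings) if postings else set()
    # Security trimming: drop results whose ACL does not overlap the user's groups.
    return sorted(url for url in hits if acls.get(url, set()) & user_groups)
```

A URL with no recorded ACL is trimmed here, which is the conservative choice for a sketch.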
15
Creating Content Sources and Crawl Schedules Creating Crawl Rules Full and Incremental Crawls Optimizing Crawl Schedules Configuring Crawl Processes
16
Content Sources
A specification of a protocol handler with at least one start address
Up to 500 content sources per Shared Service Provider
Up to 500 start addresses per content source
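The limits on this slide can be expressed as a simple validation check. This is a sketch under the assumption that a content source is just a name, a protocol, and a list of start addresses; `ContentSource` and `validate_content_sources` are hypothetical names, not a SharePoint API.

```python
from dataclasses import dataclass

# Limits as stated on the slide.
MAX_CONTENT_SOURCES_PER_SSP = 500
MAX_START_ADDRESSES_PER_SOURCE = 500


@dataclass
class ContentSource:
    name: str
    protocol: str
    start_addresses: list[str]


def validate_content_sources(sources: list[ContentSource]) -> list[str]:
    """Return a list of problems; an empty list means the plan fits the limits."""
    problems = []
    if len(sources) > MAX_CONTENT_SOURCES_PER_SSP:
        problems.append(
            f"{len(sources)} content sources exceeds {MAX_CONTENT_SOURCES_PER_SSP}"
        )
    for source in sources:
        count = len(source.start_addresses)
        if count < 1:
            problems.append(f"{source.name}: needs at least one start address")
        elif count > MAX_START_ADDRESSES_PER_SOURCE:
            problems.append(
                f"{source.name}: {count} start addresses exceeds "
                f"{MAX_START_ADDRESSES_PER_SOURCE}"
            )
    return problems
```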
17
Allows for segmentation of the corpus into manageable sections Crawl Schedules
18
Creating Crawl Rules
Adapt the Behavior of the Typical Crawl Process
– Addresses can be pattern matched for special treatment
– Supports exclusion rules and inclusion rules
– Supports altering the authentication mechanism
– Supports crawling SharePoint sites as HTTP pages
Multiple Rule Order of Precedence
– Rules applied in a configurable order
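A sketch of pattern-matched inclusion and exclusion rules evaluated in a configurable order. It assumes that the first matching rule decides and that unmatched addresses are crawled; both are assumptions for the example, as are the `CrawlRule` and `should_crawl` names and the wildcard syntax (Python's `fnmatch` patterns).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class CrawlRule:
    pattern: str   # wildcard pattern, e.g. "http://intranet/hr/*"
    include: bool  # True = inclusion rule, False = exclusion rule


def should_crawl(url: str, rules: list[CrawlRule], default: bool = True) -> bool:
    """Apply rules in list order; the first rule whose pattern matches decides."""
    for rule in rules:
        # Compare case-insensitively so the result does not depend on the OS.
        if fnmatchcase(url.lower(), rule.pattern.lower()):
            return rule.include
    return default
```

Because the first match wins in this sketch, a narrow inclusion rule has to be ordered before the broader exclusion rule it is meant to override.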
19
Full Crawls – Re-crawl existing indexed documents and new documents – Update crawl behaviors based on configuration changes in Office SharePoint Server 2007 Full Crawls
20
Incremental Crawls – Only crawl new or modified content – Dependence on Protocol Handler Characteristics – WSS 2.0 – WSS 3.0 Incremental Crawls
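The idea of crawling only new or modified content can be sketched by comparing last-modified stamps against those recorded at the previous crawl. How a real protocol handler detects changes depends on the handler, as the slide notes; the function name and the integer timestamps below are assumptions.

```python
def plan_incremental_crawl(current: dict[str, int], last_crawled: dict[str, int]) -> list[str]:
    """Return the URLs that are new or modified since the previous crawl.

    current:      URL -> last-modified stamp reported by the content source now
    last_crawled: URL -> last-modified stamp recorded when the item was indexed
    """
    return sorted(
        url
        for url, modified in current.items()
        if url not in last_crawled or modified > last_crawled[url]
    )
```

A full crawl, by contrast, would simply return every URL in `current`.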
21
When Are Full Crawls Required? WSS 3.0 Change Log Management and Crawls Crawls
22
Full Crawls vs. Incremental Crawls – Full crawls will be required periodically Corpus Size Content Volatility Document Formats and Locations Index Freshness Segmenting Corpus for Crawl Optimization Optimizing Crawl Schedules
23
Managing File Types and iFilters Implementing Managed Properties Implementing Server Name Mappings Configuring Content Access Accounts Advanced Crawl Admin
24
Protocol Handlers Load iFilters Based on Configuration Settings File Types and iFilter Mappings Managed at the Shared Service Provider Level Best Practice Is to Also Modify the DOCICON.XML file to Display Appropriate Icons in Search Results File Types & iFilters
25
Managed Properties
Combine multiple crawled properties into one managed property
Used In:
– Scope definitions
– Advanced Search
– Keyword query syntax
– Results
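A sketch of combining several crawled properties into one managed property. The crawled property names (`doc_author`, `mail_from`, `page_creator`) are invented for illustration, and taking the first non-empty value in mapping order is an assumption about how conflicts could be resolved.

```python
# Hypothetical mapping: managed property -> crawled properties, in priority order.
MANAGED_PROPERTY_MAP = {
    "Author": ["doc_author", "mail_from", "page_creator"],
}


def resolve_managed_properties(crawled: dict[str, str], mapping=MANAGED_PROPERTY_MAP) -> dict[str, str]:
    """Build managed properties from an item's crawled properties."""
    managed = {}
    for managed_name, crawled_names in mapping.items():
        for crawled_name in crawled_names:
            value = crawled.get(crawled_name)
            if value:  # first non-empty crawled value wins
                managed[managed_name] = value
                break
    return managed
```

With a mapping like this, a single `Author` property can be used in a scope rule or a keyword query regardless of whether the item was a document, an e-mail, or a page.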
26
Server Name Mappings
27
Default Content Access Accounts Overridden Content Access Accounts Content Access Accounts and Versioning in Microsoft Office SharePoint Server 2007 – Full Reader account recommended for most scenarios Content Access Accounts
28
Implementing Scopes Configuring Advanced Search Properties Query Processes
29
Scopes Scopes Are a Logical View on an Index Scopes Are Based On … Scopes Are Defined by One or More Rules Can Be Used Throughout the Search Experience Implementing Scopes
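A scope as a logical view over an index can be sketched as a filter built from one or more rules. The example assumes rules are simple property-equals-value tests combined with OR; the `ScopeRule` type is hypothetical. The `contentclass=SPSPeople` value is the one this deck gives for the People scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeRule:
    property_name: str
    value: str


def in_scope(item: dict[str, str], rules: list[ScopeRule]) -> bool:
    """An item is in scope if any rule matches (assumption: rules are OR-ed)."""
    return any(item.get(rule.property_name) == rule.value for rule in rules)


def scope_view(index: list[dict[str, str]], rules: list[ScopeRule]) -> list[dict[str, str]]:
    """Return the subset of the index visible through the scope; the index is unchanged."""
    return [item for item in index if in_scope(item, rules)]


# The People scope from this deck, expressed as a single property rule.
PEOPLE_SCOPE = [ScopeRule("contentclass", "SPSPeople")]
```

The scope never copies or re-indexes anything; it only narrows what a query sees.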
30
Advanced Search Web Part – Search Term Options – Managed Property Options Result Web Parts Search Web Parts
31
Indexing User Profiles Indexing Social Networks People Scope People Search
32
What is People Search? People Search Based on Indexing User Profile Properties User Profiles can be Imported – Active Directory – LDAP Directories – Business Data Catalog Applications Indexing User Profiles
33
What Is Social Distance? Why Is it a Useful Concept? How Is it Implemented? – My Site Colleague Tracker – Outlook add-in for suggested colleagues Indexing Social Networks
34
People Search in Search Center – Dedicated tab – Dedicated Web Parts People Scope – Based on a Managed Property Query – contentclass=SPSPeople People Scopes & Results
35
Monitoring Enterprise Search Solutions Single Server Deployments Scaled-Out Database Server Deployments Scaled-Out Web Server Deployments Scaled-Out Query Server Deployments Consolidated Web and Query Servers Enterprise Search Indexing Performance Farm Level Settings
36
Standard Counters – Memory – Disk – Processor – Network Search Counters – Query Server – Index Server Monitoring Search
37
Scaling Up Single Server Deployments Typical Usage Single Server Deployment
38
Scaling Out DB Server MOSS 2007 Database Servers
39
Scaling Out Web Server WFE Query and Index Database
40
Scaling Out Query Server
41
Collapsed Web and Query
42
Search Indexing Performance Crawler impact rules – Parallel document indexing – Degree of parallelism Database Load Options Dedicated Web Server Option
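Degree of parallelism can be illustrated with a bounded worker pool: a higher limit indexes faster but puts more load on the crawled server. This is a generic sketch using Python's standard thread pool, not the crawler's actual mechanism; `crawl_in_parallel` and the `fetch` callback are assumed names.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl_in_parallel(urls, fetch, degree_of_parallelism):
    """Fetch every URL using at most `degree_of_parallelism` simultaneous requests.

    fetch is any callable taking a URL and returning its content.
    Results are returned in the same order as `urls`.
    """
    if degree_of_parallelism < 1:
        raise ValueError("degree_of_parallelism must be at least 1")
    with ThreadPoolExecutor(max_workers=degree_of_parallelism) as pool:
        return list(pool.map(fetch, urls))
```

Setting the limit to 1 makes the crawl sequential, which is the gentlest setting for a heavily used content server.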
43
DEMO Configuration of Search
44
Search Architecture Configuring Crawl Processes Advanced Crawl Administration Configuring Query Processes Implementing People Search Administering Farm-Level Settings Review
45
Twitter - @noidentity29 Email – sbray@go-planet.com Blog – www.shannonbray.com Questions ?
46
Thank You