1
Making the Most of Your Content
2
About the Speaker: Rem Purushothaman
19+ years in information access
9+ years in enterprise search
16 million unique visitors per day
Among highest search volumes on the web
Areas of focus: Intranet, Internet, and Extranet projects across all major verticals
3
Some Knowledge of Document Management in SharePoint
4
ECM in 2013 – Convergence & Usability
Individual, Team, Organization
5
ECM and Search
6
Common Challenges
Missing or poor metadata
Duplicate documents
Siloed information
Repositories with unknown content
Incorrect security or retention

Haroon Suleman, enterprise search architect at Mercer, deploys search across 40 million critical documents stored in file systems, SharePoint, Livelink, and the corporate personnel directory. Noting that a query for “(client name) plus proposal” will get thousands of hits, he concentrates on deduplication and entity extraction and “lets the search engine do the hard work wherever possible.”
Source: Forrester Research
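The deduplication step mentioned in the quote can be illustrated with a minimal sketch. It assumes exact duplicates are found by hashing normalized text; the function names and file paths are hypothetical, and a production search engine would more likely use near-duplicate detection rather than this exact-match approach.

```python
import hashlib
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash alike."""
    return " ".join(text.lower().split())


def find_duplicates(docs: dict[str, str]) -> list[list[str]]:
    """Group document paths whose normalized content is identical."""
    groups = defaultdict(list)
    for path, text in docs.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        groups[digest].append(path)
    # Only groups with more than one path are duplicates.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


# Hypothetical documents spread across three repositories.
docs = {
    "fileshare/proposal_v1.txt": "Proposal for Contoso:  Q3 engagement",
    "sharepoint/proposal_copy.txt": "proposal for contoso: q3 engagement",
    "livelink/other.txt": "Unrelated status report",
}
duplicate_groups = find_duplicates(docs)
print(duplicate_groups)
```

Hashing normalized content keeps the comparison cheap across millions of documents, at the cost of missing copies that differ by even one word.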
7
The Long Tail of ECM
[Figure: long-tail chart with axes labeled USER PARTICIPATION and CONTENT TYPES. Traditional ECM covers managed content; the holistic approach also covers unmanaged content.]
Holistic approach:
Reduce cost and complexity
Scale to 100% of users and content under management
8
Metadata is Essential
List columns
Managed metadata
Local term sets
Global term sets
Managed in the context
Managed out of context
Static
Dynamic
Managed by admins
Managed by owners
Can be imported from CSV
9
Without Metadata, It Gets Really Hard
<HEAD>
<TITLE>Stamp Collecting World</TITLE>
<META name="description" content="Everything you wanted to know about stamps, from prices to history.">
<META name="keywords" content="stamps, stamp collecting, stamp history, prices, stamps for sale">
</HEAD>
Enterprise Search
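To show what a crawler can do when a page carries metadata like the example above, here is a minimal sketch using Python's standard `html.parser` module. The class name `HeadMetadataParser` is hypothetical, and the sketch reads only the title and named META tags.

```python
from html.parser import HTMLParser


class HeadMetadataParser(HTMLParser):
    """Collect the page title and named META tags into a dictionary."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and "name" in attrs:
            self.metadata[attrs["name"]] = attrs.get("content", "")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


page = (
    '<HEAD> <TITLE>Stamp Collecting World</TITLE> '
    '<META name="description" content="Everything you wanted to know about '
    'stamps, from prices to history."> '
    '<META name="keywords" content="stamps, stamp collecting, stamp history, '
    'prices, stamps for sale"> </HEAD>'
)
parser = HeadMetadataParser()
parser.feed(page)
keywords = [k.strip() for k in parser.metadata["keywords"].split(",")]
print(parser.metadata["title"], keywords)
```

Enterprise documents rarely arrive with tags this clean, which is why the following slides turn to machine-made metadata.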
10
Metadata in ECM and Search
DEMO
11
2 Components to Machine-Made Metadata
Content tagging:
  Concepts (vector-based information)
  Entities (noun phrases)
  Author-specific tags (explicit content tags)
Content classification:
  Hierarchy (taxonomy)
  Object-to-object classification (sets)
  Rules (linguistics, semantic, etc.)
12
Entity Extraction vs. AutoClassification
Entity extraction:
  Closed vocabulary: projects (OOB from term store)
  Open vocabulary: organizations (OOB in SP2013)
  OOB, custom, and 3rd party
  People, places, domain-specific (proteins, courts, …)
  Each entity is detected each time (can provide counts)
  Example: Title: Sales Forecast; Companies: Contoso, Tailspin Toys, Woodgrove Bank, …
Auto-classification:
  Hierarchical classifiers (not OOB, add-on)
  Works on hierarchies
  Europe->France->Paris vs. NorthAmerica->USA->Maine->Paris
  Tags once for whole document
  Example: Expertise: Strategic Consulting, Market Analysis, IT Implementation, …; Industry: Financial Services, Manufacturing, …; Technology: ...
“Process in place” vs. “Process during indexing”
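The contrast between the two approaches can be sketched in a few lines. This is an illustration only: the company list comes from the slide, while the keyword rules, the sample text, and the function names are assumptions, and real extractors and classifiers are far more sophisticated than word matching.

```python
import re
from collections import Counter

# Closed vocabulary taken from the slide's "Companies" example.
COMPANIES = ["Contoso", "Tailspin Toys", "Woodgrove Bank"]

# Hypothetical keyword rules mapping taxonomy nodes to trigger words.
TAXONOMY_RULES = {
    "Industry/Financial Services": ["bank", "loan"],
    "Industry/Manufacturing": ["factory", "assembly"],
}


def extract_entities(text, vocabulary):
    """Entity extraction: every mention is detected, so counts are available."""
    counts = Counter()
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = re.findall(pattern, text, flags=re.IGNORECASE)
        if hits:
            counts[term] = len(hits)
    return counts


def classify(text, rules):
    """Classification: the whole document receives one taxonomy node, or None."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {node: sum(words[k] for k in kws) for node, kws in rules.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


text = "Contoso asked Woodgrove Bank for a loan. Contoso will repay the bank in 2014."
entities = extract_entities(text, COMPANIES)
label = classify(text, TAXONOMY_RULES)
print(dict(entities), label)
```

The extractor returns a count per entity, while the classifier returns a single label for the document, mirroring "detected each time" versus "tags once for whole document".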
13
The “Term Store”
Service management
Hierarchy: Term Store > Group > Term Set > Term
30k terms per term set (max 1 million total)
Many term sets per group
Description • Translations • Custom properties
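The hierarchy and limits on this slide can be modeled with a small in-memory sketch. This is not the SharePoint taxonomy API; the class and method names are hypothetical, and only the two limits (30k terms per term set, 1 million total) come from the slide.

```python
from dataclasses import dataclass, field

MAX_TERMS_PER_SET = 30_000   # limit quoted on the slide
MAX_TERMS_TOTAL = 1_000_000  # limit quoted on the slide


@dataclass
class TermSet:
    name: str
    terms: list = field(default_factory=list)


@dataclass
class Group:
    name: str
    term_sets: dict = field(default_factory=dict)


@dataclass
class TermStore:
    groups: dict = field(default_factory=dict)

    def total_terms(self) -> int:
        return sum(
            len(ts.terms)
            for group in self.groups.values()
            for ts in group.term_sets.values()
        )

    def add_term(self, group: str, term_set: str, term: str) -> None:
        """Add a term, creating the group and term set on first use."""
        g = self.groups.setdefault(group, Group(group))
        ts = g.term_sets.setdefault(term_set, TermSet(term_set))
        if len(ts.terms) >= MAX_TERMS_PER_SET:
            raise ValueError(f"term set {term_set!r} is full")
        if self.total_terms() >= MAX_TERMS_TOTAL:
            raise ValueError("term store is full")
        ts.terms.append(term)


store = TermStore()
store.add_term("Corporate", "Industry", "Financial Services")
store.add_term("Corporate", "Industry", "Manufacturing")
store.add_term("Corporate", "Expertise", "Market Analysis")
print(store.total_terms())
```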
14
Longitude AutoClassifier
1. Create taxonomy from example resources and documents
2. Automatically classify documents to taxonomies
3. Use taxonomy to find and explore content
[Diagram components: Content Repositor(ies), Content Connectors and Crawler Manager, Taxonomy, Taxonomy Service, Term Store (SharePoint), Annotator, Indexer, Search Index]
15
Creating Metadata by Machine
DEMO
16
Metadata-Driven Scenarios
Scope ranges from local to global
Metadata “control” ranges from formal managed taxonomies through social tags
17
Content
18
How Much Do They Actually Index?
[Figure values: 1%, 5.6%, 1.7%]
Size of the web: > 1,000,000,000,000 unique links to pages found
Source: Google Blog
19
What Do I Index First?
Prioritize data sources
Favor:
  Highly authoritative
  Important to largest audience
  Less complexity
Avoid:
  Low authority
  Small audience
  Highly complex
[Figure: data sources numbered 1 to 7 plotted by Authority (High, Medium, Low) and Size of Audience (High, Average, Low), marked as low, normal, or high complexity; regions labeled "High business value" and "Difficult ROI".]
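One way to turn the favor/avoid guidance into a crawl order is a simple score. The sketch below is an assumption, not the speaker's method: the equal weighting, the three-level scale, and the sample sources are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical three-level scale matching the figure's High / Medium / Low labels.
LEVEL = {"low": 1, "medium": 2, "high": 3}


@dataclass
class DataSource:
    name: str
    authority: str
    audience: str
    complexity: str


def priority(source: DataSource) -> int:
    """Authority and audience raise the score; complexity lowers it."""
    return LEVEL[source.authority] + LEVEL[source.audience] - LEVEL[source.complexity]


# Hypothetical sources for illustration.
sources = [
    DataSource("Team file share", "medium", "low", "medium"),
    DataSource("Legacy archive", "low", "low", "high"),
    DataSource("Policy library", "high", "high", "low"),
]
crawl_order = [s.name for s in sorted(sources, key=priority, reverse=True)]
print(crawl_order)
```

In practice the weights would be tuned with stakeholders, but even a rough score makes the trade-off between business value and effort explicit.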
20
A Note on Indexing…
If it’s not indexed, you can’t find it!!
You can index content from anywhere, not just SharePoint
21
Configuring and Extending Crawling
Configure OOB connectors and crawls
  Scheduling! Be aware of your sources and full/incrementals
Build your own connector
  Built from SPD for simple databases and web services
  Built connectors shared across SharePoint Search and FAST search
Buy prebuilt 3rd party connector
  Knowledge of source
  Handle complex security, metadata enhancement
Leverage framework and build on it
  Managed .NET assembly BCS connector or custom BCS connector
  3rd party frameworks
22
Writing Your Own Connector
Tailoring crawls by getting into code
Capabilities beyond OOB connectors:
  Time stamp based incremental crawl
  Change log + delete log crawl
  Support for attachments
  Item level security
  Associated content
SPC213: Content Acquisition for Search in SharePoint 2010
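The first two capabilities in the list can be sketched generically. This is a language-neutral illustration in Python rather than a BCS connector (which would be a .NET assembly); the item layout, the `(operation, id, document)` change-log format, and the function names are assumptions.

```python
from datetime import datetime, timezone


def incremental_crawl(items, last_crawl):
    """Time stamp based crawl: return items modified since the last crawl
    together with the new high-water mark to store for the next run."""
    changed = [item for item in items if item["modified"] > last_crawl]
    new_mark = max((item["modified"] for item in changed), default=last_crawl)
    return changed, new_mark


def apply_change_log(index, changes):
    """Change log + delete log crawl: replay (operation, id, document) entries."""
    for operation, doc_id, document in changes:
        if operation == "delete":
            index.pop(doc_id, None)
        else:  # "add" or "update"
            index[doc_id] = document
    return index


# Hypothetical repository contents.
items = [
    {"id": "a", "modified": datetime(2013, 1, 5, tzinfo=timezone.utc)},
    {"id": "b", "modified": datetime(2013, 2, 1, tzinfo=timezone.utc)},
    {"id": "c", "modified": datetime(2013, 3, 9, tzinfo=timezone.utc)},
]
last_crawl = datetime(2013, 1, 31, tzinfo=timezone.utc)
changed, new_mark = incremental_crawl(items, last_crawl)

index = {"a": "old text", "b": "stale text"}
apply_change_log(index, [("update", "b", "fresh text"), ("delete", "a", None)])
print([item["id"] for item in changed], new_mark, index)
```

A time stamp crawl cannot see deletions, which is why the change log variant pairs updates with an explicit delete log.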
23
Connectors and Enterprise Content
[Diagram labels: Unified Search Index; Security Mapping; Metadata Enrichment; Change Log; Targets Search Optimized API; SharePoint; SharePoint Libraries & Lists]
24
Working Across Multiple Repositories
DEMO
25
Summary
26
@RemSearchPro