Making the Most of Your Content.


Presentation on theme: "Making the Most of Your Content."— Presentation transcript:

1 Making the Most of Your Content

2 About the Speaker
Rem Purushothaman
• 19+ years in information access
• 9+ years in enterprise search
• 16 million unique visitors per day
• Among highest search volumes on the web
• Areas of focus
• Intranet, Internet, and Extranet projects across all major verticals

3 Some Knowledge of Document Management in SharePoint

4 ECM in 2013 – Convergence & Usability
Individual, Team, Organization

5 ECM and Search
Sources:

6 Common Challenges
• Missing or poor metadata
• Duplicate documents
• Siloed information
• Repositories with unknown content
• Incorrect security or retention
Haroon Suleman, enterprise search architect at Mercer, deploys search across 40 million critical documents stored in file systems, SharePoint, Livelink, and the corporate personnel directory. Noting that a query for “(client name) plus proposal” will get thousands of hits, he concentrates on deduplication and entity extraction and “lets the search engine do the hard work wherever possible.”
Source: Forrester Research
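The deduplication idea mentioned on this slide can be sketched as follows. This is a minimal illustration, assuming exact duplicates are found by hashing whitespace-normalized text; the function names and sample documents are hypothetical and do not come from the talk, and a real search engine would more likely use near-duplicate detection than exact hashing.

```python
import hashlib
from collections import defaultdict


def content_fingerprint(text: str) -> str:
    """Return a SHA-256 hash of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(documents: dict) -> list:
    """Group document paths whose normalized content is identical."""
    groups = defaultdict(list)
    for path, text in documents.items():
        groups[content_fingerprint(text)].append(path)
    # Keep only groups that contain more than one document.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]


# Hypothetical sample: the same proposal stored in two repositories.
docs = {
    "fileshare/proposal.txt": "Contoso  Proposal\nDraft 1",
    "sharepoint/proposal.txt": "contoso proposal draft 1",
    "livelink/forecast.txt": "Sales forecast",
}
duplicate_groups = group_duplicates(docs)
```

Each group could then be collapsed to a single search result, with the other copies listed as alternate locations.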

7 The Long Tail of ECM
[Diagram labels: User Participation, Content Types, Traditional ECM, Managed Content, Unmanaged Content, Holistic Approach]
• Reduce cost and complexity
• Scale to 100% of users and content under management

8 Metadata is Essential
• List columns
• Managed metadata
• Local term sets
• Global term sets
• Managed in the context
• Managed out of context
• Static
• Dynamic
• Managed by admins
• Managed by owners
• Can be imported from CSV

9 Without Metadata, it Gets Really Hard
<HEAD>
<TITLE>Stamp Collecting World</TITLE>
<META name="description" content="Everything you wanted to know about stamps, from prices to history.">
<META name="keywords" content="stamps, stamp collecting, stamp history, prices, stamps for sale">
</HEAD>
Enterprise Search
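To show how a crawler might read the metadata in the page header on this slide, here is a small sketch using Python's standard `html.parser` module. The `MetaExtractor` class name is hypothetical; only the HTML itself comes from the slide.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and named <meta> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name"):
            self.meta[attributes["name"].lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


page = """<HEAD> <TITLE>Stamp Collecting World</TITLE>
<META name="description" content="Everything you wanted to know about stamps, from prices to history.">
<META name="keywords" content="stamps, stamp collecting, stamp history, prices, stamps for sale">
</HEAD>"""

extractor = MetaExtractor()
extractor.feed(page)
keywords = [k.strip() for k in extractor.meta["keywords"].split(",")]
```

With the tags present, the title, description, and keyword list are available as structured fields; without them, an indexer has only the body text to work from.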

10 Metadata in ECM and Search
DEMO

11 Two Components to Machine-Made Metadata
Content tagging:
• Concepts (vector based information)
• Entities (noun phrases)
• Author specific tags (explicit content tags)
Content classification:
• Hierarchy (taxonomy)
• Object to object classification (sets)
• Rules (linguistics, semantic, etc.)

12 Entity Extraction vs. AutoClassification
Entity extraction:
• Closed vocabulary: projects (OOB from term store)
• Open vocabulary: organizations (OOB in SP2013)
• OOB, custom, and 3rd party
• People, places, domain-specific (proteins, courts, …)
• Each entity is detected each time (can provide counts)
Example: Title: Sales Forecast; Companies: Contoso, Tailspin Toys, Woodgrove Bank
AutoClassification:
• Hierarchical classifiers (not OOB, add-on)
• Works on hierarchies
• Europe->France->Paris vs. NorthAmerica->USA->Maine->Paris
• Tags once for whole document
Example: Expertise: Strategic Consulting, Market Analysis, IT Implementation; Industry: Financial Services, Manufacturing; Technology: ...
“Process in place” vs. “Process during indexing”
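The contrast on this slide can be sketched in a few lines: entity extraction counts every mention of a vocabulary term, while classification assigns one taxonomy path to the whole document. This is only an illustration under simple assumptions (exact phrase matching and keyword overlap); the keyword sets, function names, and sample document are hypothetical, and this is not how SharePoint or any add-on classifier is implemented.

```python
import re
from collections import Counter

# Hypothetical closed vocabulary, standing in for terms from a term store.
COMPANIES = ["Contoso", "Tailspin Toys", "Woodgrove Bank"]

# Hypothetical taxonomy: each full path maps to keywords that suggest it.
TAXONOMY = {
    "Europe->France->Paris": {"france", "seine"},
    "NorthAmerica->USA->Maine->Paris": {"maine", "usa"},
}


def extract_entities(text, vocabulary):
    """Count every occurrence of each vocabulary term (detected each time)."""
    counts = Counter()
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if hits:
            counts[term] = hits
    return counts


def classify(text, taxonomy):
    """Assign one taxonomy path to the whole document, or None if nothing matches."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {path: len(words & keywords) for path, keywords in taxonomy.items()}
    best_path = max(scores, key=scores.get)
    return best_path if scores[best_path] > 0 else None


doc = (
    "Sales forecast for Contoso and Tailspin Toys. Contoso will open an "
    "office in Paris, Maine, its third in the USA."
)
entities = extract_entities(doc, COMPANIES)
category = classify(doc, TAXONOMY)
```

Here the extractor reports two mentions of Contoso and one of Tailspin Toys, while the classifier tags the document once, resolving the ambiguous "Paris" to the Maine branch of the hierarchy.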

13 The “Term Store”
Service management
Term Store -> Group -> Term Set -> Term
• 30k terms per term set (max 1 million total)
• Many term sets per group
• Description • Translations • Custom properties
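The hierarchy and limits on this slide can be modeled with a few plain data classes. This is a sketch for illustration only: it is not the SharePoint taxonomy API, all class and method names are hypothetical, and the two limits are simply the figures quoted on the slide.

```python
from dataclasses import dataclass, field

MAX_TERMS_PER_TERM_SET = 30_000   # limit quoted on the slide
MAX_TERMS_TOTAL = 1_000_000       # limit quoted on the slide


@dataclass
class Term:
    name: str
    description: str = ""
    translations: dict = field(default_factory=dict)       # language code -> label
    custom_properties: dict = field(default_factory=dict)


@dataclass
class TermSet:
    name: str
    terms: list = field(default_factory=list)


@dataclass
class Group:
    name: str
    term_sets: list = field(default_factory=list)


@dataclass
class TermStore:
    groups: list = field(default_factory=list)

    def total_terms(self):
        """Count terms across every term set in every group."""
        return sum(len(ts.terms) for g in self.groups for ts in g.term_sets)

    def add_term(self, term_set, term):
        """Add a term, enforcing the per-term-set and store-wide limits."""
        if len(term_set.terms) >= MAX_TERMS_PER_TERM_SET:
            raise ValueError(f"term set {term_set.name!r} is full")
        if self.total_terms() >= MAX_TERMS_TOTAL:
            raise ValueError("term store is full")
        term_set.terms.append(term)


industries = TermSet("Industry")
store = TermStore(groups=[Group("Corporate", term_sets=[industries])])
store.add_term(industries, Term("Financial Services", translations={"fr": "Services financiers"}))
store.add_term(industries, Term("Manufacturing"))
```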

14 Longitude AutoClassifier
1. Create taxonomy from example resources and documents
2. Automatically classify documents to taxonomies
3. Use taxonomy to find and explore content
[Diagram components: Taxonomy, Content Repositor(ies), Content Connectors and Crawler Manager, Term Store (SharePoint), Taxonomy Service, Search Index, Annotator, Indexer]

15 Creating Metadata by Machine
DEMO

16 Metadata-Driven Scenarios
• Scope ranges from local to global
• Metadata “control” ranges from formal managed taxonomies through social tags

17 Content

18 How Much Do They Actually Index?
[Chart values: 1%, 5.6%, 1.7%]
Size of the web: > 1,000,000,000,000 unique links to pages found
Source: Google Blog

19 What Do I Index First?
Prioritize data sources
Favor:
• Highly authoritative
• Important to largest audience
• Less complexity
Avoid:
• Low authority
• Small audience
• Highly complex
[Chart: sources numbered 1 to 7, ranging from high business value to difficult ROI; axes are Authority (High, Medium, Low) and Average Size of Audience (High to Low); legend shows Low complexity, Normal complexity, High complexity]
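The favor/avoid guidance on this slide can be turned into a rough ranking. The sketch below assumes each source is rated low, medium, or high on authority, audience size, and complexity, and that the three factors are weighted equally; the weighting, names, and sample ratings are all hypothetical.

```python
from dataclasses import dataclass

LEVELS = {"low": 1, "medium": 2, "high": 3}


@dataclass
class DataSource:
    name: str
    authority: str      # "low", "medium", or "high"
    audience: str       # "low", "medium", or "high"
    complexity: str     # "low", "medium", or "high"


def priority_score(source):
    """Higher is better: reward authority and audience, penalize complexity.

    The equal weighting is an assumption made for illustration.
    """
    return LEVELS[source.authority] + LEVELS[source.audience] - LEVELS[source.complexity]


def crawl_order(sources):
    """Return sources sorted from highest to lowest priority."""
    return sorted(sources, key=priority_score, reverse=True)


# Hypothetical sources with made-up ratings.
sources = [
    DataSource("Legacy file share", authority="low", audience="low", complexity="high"),
    DataSource("HR policy library", authority="high", audience="high", complexity="low"),
    DataSource("Project team sites", authority="medium", audience="medium", complexity="medium"),
]
ordered_names = [s.name for s in crawl_order(sources)]
```

An authoritative, widely used, simple source lands first; a low-authority, small-audience, complex one lands last.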

20 A Note on Indexing…
• If it’s not indexed, you can’t find it!!
• You can index content from anywhere, not just SharePoint

21 Configuring and Extending Crawling
• Configure OOB connectors and crawls
  • Scheduling! Be aware of your sources and full/incrementals
• Build your own connector
  • Built from SPD for simple databases and web services
  • Built connectors shared across SharePoint Search and FAST search
• Buy prebuilt 3rd party connector
  • Knowledge of source
  • Handle complex security, metadata enhancement
• Leverage framework and build on it
  • Managed .NET assembly BCS connector or custom BCS connector
  • 3rd party frameworks

22 Writing Your Own Connector
Tailoring crawls by getting into code
Capabilities beyond OOB connectors:
• Time stamp based incremental crawl
• Change log + delete log crawl
• Support for attachments
• Item level security
• Associated content
SPC213: Content Acquisition for Search in SharePoint 2010
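The first two capabilities listed on this slide can be illustrated with a small sketch: re-index only items modified since the last crawl, and remove items that the repository's delete log reports as gone. This is a language-neutral illustration in Python, not BCS connector code; the function signature and sample data are hypothetical.

```python
from datetime import datetime, timezone


def incremental_crawl(index, source_items, deleted_ids, last_crawl):
    """Update `index` in place using modification time stamps and a delete log.

    index:        dict mapping item id -> indexed content
    source_items: iterable of (item_id, modified, content) tuples from the repository
    deleted_ids:  item ids the repository's delete log reports as removed
    last_crawl:   time of the previous crawl; items not modified since are skipped
    Returns the ids that were (re)indexed and the ids that were removed.
    """
    updated, removed = [], []
    for item_id, modified, content in source_items:
        if modified > last_crawl:
            index[item_id] = content
            updated.append(item_id)
    for item_id in deleted_ids:
        if index.pop(item_id, None) is not None:
            removed.append(item_id)
    return updated, removed


# Hypothetical repository state.
last_crawl = datetime(2013, 1, 1, tzinfo=timezone.utc)
index = {"a": "old proposal", "b": "budget", "c": "obsolete memo"}
source_items = [
    ("a", datetime(2013, 2, 1, tzinfo=timezone.utc), "revised proposal"),
    ("b", datetime(2012, 12, 1, tzinfo=timezone.utc), "budget"),
    ("d", datetime(2013, 1, 15, tzinfo=timezone.utc), "new contract"),
]
updated, removed = incremental_crawl(index, source_items, ["c"], last_crawl)
```

Only the changed and new items are touched, which is what makes an incremental crawl cheaper than a full one.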

23 Connectors and Enterprise Content
[Diagram labels: Unified Search Index, Security Mapping, Metadata Enrichment, Change Log, Targets, Search Optimized API, SharePoint, SharePoint Libraries & Lists]

24 Working Across Multiple Repositories
DEMO

25 Summary

26 @RemSearchPro

