CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it, OIS. Andreas Wagner – CERN IT/OIS, Eduardo Alvarez – CERN IT/OIS, Sergio Fernandez – CERN IT/OIS.

Presentation transcript:

Andreas Wagner – CERN IT/OIS
Eduardo Alvarez – CERN IT/OIS
Sergio Fernandez – CERN IT/OIS

Summary
Introduction to search
Inside CERN Search
New Search Solution
– Concepts, collections, pipelines, stages, architecture
– Search features
Demo
Conclusions and future work

What is Search?
Search is the art of balancing three factors:
– Recall: how many of the matching documents were returned?
– Precision: of the returned documents, how many match the query?
– Relevancy: how well does a document match the query?
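The first two factors can be made concrete with a small calculation. The following is a minimal sketch, assuming that the set of documents that truly match a query is known in advance (in practice it rarely is); the function name and the document identifiers are hypothetical and do not come from the slides.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a single query.

    returned: identifiers the search engine gave back
    relevant: identifiers that actually match the query (assumed known)
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    # Precision: share of the returned documents that really match.
    precision = len(hits) / len(returned) if returned else 0.0
    # Recall: share of the matching documents that were returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Four documents returned, three documents actually match, two overlap.
p, r = precision_recall(
    returned=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d2", "d5"],
)
```

Returning more documents tends to raise recall and lower precision, which is the balance the slide refers to.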

Enterprise Search
Wide range of document sources:
– Web pages
– File systems
– Databases
– Directories (people and places)
– Document repositories (CDS, EDMS, Indico, …)
– Structured CMS data (SharePoint, Drupal, TWiki)
Variety of metadata
Different access protection schemes
Different retrieval methods and frequencies

Enterprise Search
Components of Enterprise Search:
– Search engine / search technology
– Integration within the existing infrastructure (authentication, authorization)
– Document retrieval: not only Web pages, but also database/XML data (CDS, Indico, phone data)
– Protected documents: access to the document data, plus information about the ACLs
– Ranking of documents
Enterprise Search is not only a question of the search technology used; it requires collaboration with the data owners.

What about Google?
What makes Google Web search so good:
– Huge Web space analysis capabilities
– Huge volume of usage data used for “voting” on the results, so the most popular results are promoted
– Substantial resources to tune and correct results: usage data analysis, taking popular events into account, hand-edited results for popular single-keyword searches
– Personalized filtering of results, based on location, preferences, search history, …
The above is valid for all public Web search engines (Yahoo, Bing).
At the same time, Web search is not Enterprise Search!


Search at CERN?
Why a Search Service? If…
– Every system usually has its own search system
Probably one of the best places for this service:
– Quite a lot of different content sources
– High rate of new content
– Solutions are not always optimal
– Centralize the search of content

CERN Search
A central search solution to provide:
– for users: a single entry point for searching information on several content sources at the same time
– for service providers: a search backend service (TWiki, Drupal, SharePoint, JACOW, Groups)
Start of the project in February 2006, based on a commercial product from FAST (Microsoft subsidiary and market leader)
CERN Search in production since 2007
Present resources: 1 PJAS and some fraction of a staff member

CERN Search: Latest Progress
2009: Migration to FAST ESP; reorganization of the indexed Web space (improved relevancy)
2010: TWiki protected pages indexed; service used as the default TWiki search
1Q 2011: Indico protected documents and material
1Q 2011: Index of the SharePoint content
3Q 2011: Migration to FAST Search Server 2010 for SharePoint

Overview of Indexed Documents [figure]


Concepts I
[Diagram labels: Document, Pipeline, Processing Stage, Collection, Crawler (Files, Web), Collection A, Collection B]

Concepts II
Document Content Flow
[Diagram labels: Connectors (Push & Pull), Content API, Filter API, Query API, Document retrieval, Document indexing, Document processing]
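The relationship between documents, collections, pipelines and processing stages can be sketched in a few lines. This is only an illustration of the concepts, not the FAST API: the stage functions, the collection names and the document fields are all hypothetical.

```python
from typing import Callable, Dict, List

Document = Dict[str, object]
Stage = Callable[[Document], Document]


def lowercase_title(doc: Document) -> Document:
    """Hypothetical stage: normalise the title."""
    out = dict(doc)
    out["title"] = str(doc.get("title", "")).lower()
    return out


def count_words(doc: Document) -> Document:
    """Hypothetical stage: add a word count computed from the body."""
    out = dict(doc)
    out["word_count"] = len(str(doc.get("body", "")).split())
    return out


# Each collection is bound to one pipeline, i.e. an ordered list of stages.
PIPELINES: Dict[str, List[Stage]] = {
    "web": [lowercase_title, count_words],
    "files": [count_words],
}


def process(doc: Document, collection: str) -> Document:
    """Run a document through the pipeline of its collection, stage by stage."""
    for stage in PIPELINES[collection]:
        doc = stage(doc)
    return doc


processed = process({"title": "CERN Search", "body": "hello big world"}, "web")
```

Keeping each stage a small function that takes and returns a document makes it easy to reuse a stage in several pipelines.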

Indexing Protected Content I
To allow indexing of protected content we need to:
– Retrieve the document: the search engine needs access to the document
– Obtain the document ACLs, to be able to decide who is allowed to find a document
Obtaining the ACLs is often not trivial, since most systems answer the question “Does a given user have the right to access a given document?” and not “Who has access to a given document?”
This is due to often complex permission models, including inheritance, fine granularity of permissions, and permissions changing during the document lifecycle.

Indexing Protected Content II
Document processing:
– Resolve ACLs to SIDs
– Send them to the indexer together with the document
FSA (FAST Search Authorization) component:
– Active Directory integration, i.e. based on CERN accounts and e-groups
[Diagram labels: Document Repository, Document, ACL, Document Processing, Active Directory Users & Groups, Doc + ACL, Search Index, CERN Search]
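The index-time step described on this slide (resolve the ACL entries to SIDs, then send them to the indexer with the document) might look roughly as follows. This is a sketch under stated assumptions: the in-memory directory stands in for Active Directory, and the account names, group names, SID values and the `acl_sids` field are all made up for illustration.

```python
# Hypothetical directory: account or group name -> SID (values are made up).
DIRECTORY = {
    "user-a": "S-1-5-21-0-0-0-1001",
    "group-x": "S-1-5-21-0-0-0-2001",
}


def resolve_acl(names, directory):
    """Split ACL entries into resolved SIDs and names the directory does not know."""
    sids, unresolved = [], []
    for name in names:
        sid = directory.get(name.lower())
        if sid is None:
            unresolved.append(name)
        else:
            sids.append(sid)
    return sids, unresolved


def attach_acl(doc, acl_names, directory=DIRECTORY):
    """Return a copy of the document carrying its ACL as SIDs, plus unresolved names."""
    sids, unresolved = resolve_acl(acl_names, directory)
    indexed = dict(doc)
    indexed["acl_sids"] = sorted(set(sids))
    return indexed, unresolved


doc, missing = attach_acl(
    {"id": "doc-1", "title": "Minutes"},
    ["User-A", "group-x", "group-y"],
)
```

Reporting unresolved names instead of silently dropping them makes it possible to spot documents whose protection could not be translated.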

Authentication / Authorisation
Authentication by the front end
FSA creates a filter with the expanded user credentials and groups
[Diagram labels: Search Front End, Authentication (SSO) & Search, Query & Identity, Query Processing, Group Membership, Active Directory Users & Groups, Search Index, CERN Search]
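At query time, the idea of a filter built from the expanded user credentials and groups can be illustrated with a toy index. The sketch below assumes that every indexed document carries a list of SIDs allowed to see it; the users, groups, SIDs and documents are invented, and nested groups are deliberately not handled.

```python
# Hypothetical identity data (all names and SIDs are made up).
USER_SIDS = {"user-a": "S-1001", "user-b": "S-1002"}
GROUP_SIDS = {"group-x": "S-2001"}
GROUP_MEMBERS = {"group-x": {"user-a"}}

# Toy index: each document lists the SIDs that may find it.
INDEX = [
    {"id": "doc-1", "text": "LHC status report", "acl_sids": ["S-2001"]},
    {"id": "doc-2", "text": "LHC budget draft", "acl_sids": ["S-1002"]},
    {"id": "doc-3", "text": "Cafeteria menu", "acl_sids": ["S-1001", "S-1002"]},
]


def expand_identity(user):
    """Return the SID of the user plus the SIDs of the groups the user is in."""
    sids = {USER_SIDS[user]}
    for group, members in GROUP_MEMBERS.items():
        if user in members:
            sids.add(GROUP_SIDS[group])
    return sids


def search(term, user):
    """Return ids of documents that match the term and that the user may see."""
    allowed = expand_identity(user)
    return [
        doc["id"]
        for doc in INDEX
        if term.lower() in doc["text"].lower()
        and allowed.intersection(doc["acl_sids"])
    ]
```

For example, `search("lhc", "user-a")` only finds the document shared with `group-x`, while the same query by `user-b` only finds the document shared with that user directly.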

FAST Search for SharePoint Cluster Architecture [figure]

Index Profile
Final representation of each document
Set of attributes to index (managed properties):
– Title
– Author
– Last modified date
– ACLs
Define which properties are queryable, refiners, or sortable
Define FullTextIndex properties
Define mappings to the FullTextIndex
Flexible

Result Ranking – Rank Profiles [figure]

Ranking Issues at CERN
Flat Web space:
– Lack of metadata (copy-paste, poorly formed HTML meta tags, ...)
– Isolated sites (not many inter-links, only to the CERN main page)
Good experience with well-structured content:
– Indico, CDS
How to improve ranking?
– Manual tuning of results (promote, demote)
– Modify the rank profile
– Custom processing stage for static rank points
Not easy:
– Manpower intensive
– Requires a better understanding of the indexed data
– No magic solution; the rank profile has to be balanced for different collections
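A custom processing stage that assigns static rank points could be as simple as the sketch below. It is an illustration only: the point values, the `static_rank` field, the `collection` and `url` fields and the promotion bonus are assumptions, and how the search engine combines static points with its query-dependent score is not shown here.

```python
# Hypothetical static rank points per collection (values chosen for illustration).
STATIC_RANK_POINTS = {"indico": 30, "cds": 30, "web": 0}

# Hypothetical bonus for manually promoted URLs.
PROMOTION_BONUS = 50


def static_rank_stage(doc, points=STATIC_RANK_POINTS, promoted_urls=frozenset()):
    """Return a copy of the document with a query-independent rank value added."""
    out = dict(doc)
    rank = points.get(doc.get("collection"), 0)
    if doc.get("url") in promoted_urls:
        rank += PROMOTION_BONUS
    out["static_rank"] = rank
    return out


ranked = static_rank_stage(
    {"url": "https://example.org/event", "collection": "indico"},
    promoted_urls={"https://example.org/event"},
)
```

Because the points differ per collection, changing one value shifts the balance between collections, which is why the slide calls for balancing the rank profile rather than expecting a single fix.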

Changes to FAST ESP Products
Before, only one product:
– FAST ESP 5.3 (standalone product)
Now, several possibilities:
– FAST: FAST Search Server 2010 for Internal Applications (FSIA), FAST Search Server 2010 for Internet Sites (FSIS)
– Microsoft + FAST: FAST Search Server 2010 for SharePoint (FS4SP)
  – Same core
  – Configuration and OTB pipeline adapted for SharePoint
  – Reduced set of tools; others migrated to SharePoint or PowerShell cmdlets

FAST Search for SharePoint Architecture Overview [figure]

FAST Search for SharePoint Topology
[Diagram labels: SharePoint Crawler, SharePoint Sites, Web Sites, File Shares, Exchange public folders, Lotus Notes, FAST Enterprise Crawler, Search Centre]

Server Architecture
Two systems (Production + Dev)
Using SharePoint Central Service
Production:
– 1 admin node
– 1 crawler + pre-processing node
– 4-node index cluster, with both roles (Indexer and Search)
– 2 rows: backup, query performance
– 2 columns: easily handles more than 30 million documents
– High reliability on critical components: Content Distributors, QueryServers, Document Processors
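The 2 rows by 2 columns layout can be pictured as a grid of four index nodes: columns split the index into partitions, and rows hold copies of those partitions. The sketch below only illustrates that idea. The hash-based assignment of documents to columns and the round-robin choice of a row per query are assumptions made for the example, not a description of how the product distributes work.

```python
import zlib

ROWS = 2     # copies of the index (backup, query performance)
COLUMNS = 2  # partitions of the index (capacity)


def column_for(doc_id, columns=COLUMNS):
    """Pick the partition of a document (assumed: stable hash of its id)."""
    return zlib.crc32(doc_id.encode("utf-8")) % columns


def nodes_for_document(doc_id):
    """A document lives in one column, on every row of that column."""
    col = column_for(doc_id)
    return [(row, col) for row in range(ROWS)]


def nodes_for_query(query_number):
    """A query is served by one row and must visit every column of that row."""
    row = query_number % ROWS
    return [(row, col) for col in range(COLUMNS)]
```

With this layout, losing one node still leaves a full copy of its partition on the other row, and adding a column spreads the documents over more partitions.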

FAST Search for SharePoint: New Features (I)
New query suggestions model
– Based on dictionary and common user queries
Best Bets & Visual Best Bets
Custom search experience (per user/role)
New management system (Microsoft style)
– SCOM, PowerShell, …
SharePoint integration
Phonetic and nickname search
Thumbnails and previews in results

FAST Search for SharePoint: New Features (II)
Entity extraction
Office Web Apps integration
Relevance improvements with social behaviour
– Click-through relevancy
Enhanced results refinement
– Deep results refinement
– Based on any managed properties
– Similar results
Federated search
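Results refinement amounts to counting the values of a managed property across the results, so that the user can narrow the search by one of them. The sketch below contrasts counting over the whole result set ("deep") with counting over only the first few results. The function name, the `depth` parameter and the sample results are hypothetical.

```python
from collections import Counter


def refiner_counts(results, prop, depth=None):
    """Count the values of one property over the results.

    depth=None counts over the whole result set (deep refinement);
    an integer depth only looks at the first `depth` results.
    """
    sample = results if depth is None else results[:depth]
    return Counter(doc[prop] for doc in sample if prop in doc)


RESULTS = [
    {"id": "d1", "author": "wagner"},
    {"id": "d2", "author": "alvarez"},
    {"id": "d3", "author": "wagner"},
    {"id": "d4"},  # no author metadata
    {"id": "d5", "author": "fernandez"},
]

deep = refiner_counts(RESULTS, "author")
shallow = refiner_counts(RESULTS, "author", depth=2)
```

The shallow counts miss values that only occur further down the result list, which is the gap that deep refinement closes.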

Migration Process
Migrate pipelines
Adapt retrieval and pre-processing scripts
Port custom processing stages
Migrate the feed process to use SharePoint crawlers (file shares)
Customize the Search Centre to offer the same functionality as the old system
Create general helper tools
– Manage index profile
– Manage keywords, best bets, …

Examples: Best Bets & Visual Best Bets [figure]

Examples: Visual Refiners [figure]

Examples: Federated search (Google, Bing, Twitter) [figure]

Search-Driven Application [figure]


Conclusions
Successfully migrated all the content from the old system
– Experience with the same technology
Reduced tools and help for content other than SharePoint
But:
– New interesting features, SharePoint integration
– Complete Search Centre
Larger community behind it
High cohesion between SharePoint and Search Services

Next Steps
Integration with Drupal
– Customized pre-processing, processing, index and query
Index SSO Centrally Managed Sites
– Own SSO crawling, get ACLs, processing
Continue evolving the new system
– Take advantage of all FS4SP features: Office Web Apps, Visual Refiners, phonetic search, ...
– Together with content providers, improve relevancy, Best Bets, ...

CERN Search:
and also via:
– CERN Intranet & Public Pages
– TWiki
– IT, HR, PH Websites
– JACOW
