Published by Howard Cameron. Modified over 9 years ago.
1
Introduction to Full-Text Searching in SQL Server 2012
Adolfo J. Socorro, Ph.D.
IT Impact, Inc.
asocorro@itimpact.com
2
Outline
What can we do with FTS?
How to install FTS
FTS components
Creating FTS indexes
How to query with FTS
FILESTREAM and FileTable
3
FTS Basics
FTS allows searching against character-based data
Supported column types: char, varchar, nchar, nvarchar, text, ntext, image, xml, varbinary, varbinary(max)
4
Search Functionality
Specific words or phrases: “hotel” => “hotel”
Prefixes: “fan” => “fantastic”, “fantasy”; “local store” => “locally stored”
Inflectional forms: “minimized” => “minimizing”, “minimise”
5
Search Functionality
Proximity: “search,query” => “query to perform search”
Synonyms: “folder” => “directory”
Weighted values
6
A First Look Let’s run some simple examples to get a feel for FTS!
7
LIKE vs FTS
LIKE works on character patterns only
Cannot use the LIKE predicate to query formatted binary data
FTS is much faster against large amounts of unstructured text data
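A minimal sketch of the contrast, assuming a hypothetical table dbo.Articles(ArticleId, Title, Body) that already has a full-text index on Body. The table and column names are illustrative only.

```sql
-- LIKE: character-pattern match. The leading wildcard means every row's
-- text has to be examined, and substrings such as "hotels" also match.
SELECT ArticleId, Title
FROM dbo.Articles
WHERE Body LIKE N'%hotel%';

-- FTS: word-based lookup answered from the full-text index.
-- Matches rows that contain the word "hotel".
SELECT ArticleId, Title
FROM dbo.Articles
WHERE CONTAINS(Body, N'hotel');
```

The two queries are not equivalent: LIKE matches character sequences, while CONTAINS matches words produced by the word breaker.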
8
Supported SQL Server Editions
Enterprise
Business Intelligence
Standard
Web
Express with Advanced Services
Available since at least SQL Server 2000
9
FTS Components
Word breaker
Stemmer
Stoplists
Thesaurus
Filters
Property lists
10
Language Support
50+ languages
Language-specific components: word breakers and stemmers, stoplists, thesaurus files
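One way to see which languages an instance supports is the sys.fulltext_languages catalog view. The LCID it returns is the value used in LANGUAGE clauses elsewhere.

```sql
-- List the full-text languages registered on this instance.
-- lcid is the locale identifier (for example, 1033 is US English).
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;
```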
11
How to Install
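Full-text search is selected as a feature during SQL Server setup. A quick check of whether it is present on an existing instance might look like this:

```sql
-- Returns 1 when the full-text component is installed, 0 otherwise.
SELECT FULLTEXTSERVICEPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;

-- The same information through SERVERPROPERTY.
SELECT SERVERPROPERTY('IsFullTextInstalled') AS IsFullTextInstalled;
```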
12
Default FTS Language
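The default full-text language is a server-level setting that applies when a column is indexed without an explicit LANGUAGE clause. A sketch of reading and changing it follows; the value 1033 (US English) is only an example, and the script turns on advanced options first in case the setting is hidden otherwise.

```sql
-- Read the current setting.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'default full-text language';

-- Change it (example value: 1033, US English).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'default full-text language', 1033;
RECONFIGURE;
```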
13
FTS Indexes
One index per table or indexed view
Must have a unique, single-column, non-nullable index on the table
Grouped within the same database into one or more full-text catalogs (“containers”)
14
Full-Text Catalogs
A logical construct
A way to manage FT indexes together
15
Index Population
Population: the addition of data to full-text indexes
Automatic
Manual
On request
Scheduled
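The population options can be sketched in T-SQL as follows, assuming a hypothetical table dbo.Articles that already has a full-text index. The mapping of each statement to the options listed above is one reading of the slide. The statements are alternatives; run them one at a time, not as a single script.

```sql
-- Automatic: changes are tracked and propagated to the index in the background.
ALTER FULLTEXT INDEX ON dbo.Articles SET CHANGE_TRACKING AUTO;

-- Manual: changes are tracked but applied only when you ask for it.
ALTER FULLTEXT INDEX ON dbo.Articles SET CHANGE_TRACKING MANUAL;
ALTER FULLTEXT INDEX ON dbo.Articles START UPDATE POPULATION;

-- On request: rebuild the index contents from scratch.
ALTER FULLTEXT INDEX ON dbo.Articles START FULL POPULATION;

-- Scheduled: run one of the START ... POPULATION statements
-- from a SQL Server Agent job.

-- Check progress (0 means idle).
SELECT OBJECTPROPERTYEX(OBJECT_ID(N'dbo.Articles'),
                        'TableFulltextPopulateStatus') AS PopulateStatus;
```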
16
Steps to Set Up an Index on a Table
Create a full-text catalog
For each column to index: indicate language, indicate document type *
Choose a change-tracking mechanism
17
Full-Text Index Wizard
18
Example: Create Catalog and Index
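The original demo script is not part of this transcript. A minimal sketch of the same steps, using a hypothetical dbo.Articles table and catalog name, could look like this (GO is the batch separator used by SSMS and sqlcmd):

```sql
-- Hypothetical table; the primary key provides the unique,
-- single-column, non-nullable index that FTS requires.
CREATE TABLE dbo.Articles (
    ArticleId int IDENTITY(1,1) NOT NULL,
    Title     nvarchar(200)     NOT NULL,
    Body      nvarchar(max)     NULL,
    CONSTRAINT PK_Articles PRIMARY KEY CLUSTERED (ArticleId)
);
GO

-- Catalog: a logical container for full-text indexes.
CREATE FULLTEXT CATALOG ArticlesCatalog AS DEFAULT;
GO

-- Index two columns, both parsed as US English (LCID 1033).
-- Binary columns would also need TYPE COLUMN to name the document type.
CREATE FULLTEXT INDEX ON dbo.Articles
(
    Title LANGUAGE 1033,
    Body  LANGUAGE 1033
)
KEY INDEX PK_Articles
ON ArticlesCatalog
WITH CHANGE_TRACKING AUTO;
GO
```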
19
CONTAINS
Precise or prefix matches to single words and phrases
Proximity matches
Logical operations between conditions: AND, OR, AND NOT
Optional use of inflectional forms and thesaurus
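Each CONTAINS capability has its own search-condition syntax. The queries below are a sketch against a hypothetical dbo.Articles table with a full-text index on Body; the search terms are borrowed from the earlier slides.

```sql
-- Exact phrase.
SELECT ArticleId FROM dbo.Articles
WHERE CONTAINS(Body, N'"local store"');

-- Prefix: words that start with "fan".
SELECT ArticleId FROM dbo.Articles
WHERE CONTAINS(Body, N'"fan*"');

-- Proximity: both terms within 5 terms of each other
-- (custom NEAR syntax, introduced in SQL Server 2012).
SELECT ArticleId FROM dbo.Articles
WHERE CONTAINS(Body, N'NEAR((search, query), 5)');

-- Logical operators.
SELECT ArticleId FROM dbo.Articles
WHERE CONTAINS(Body, N'hotel AND NOT motel');

-- Inflectional forms and thesaurus are opt-in.
SELECT ArticleId FROM dbo.Articles
WHERE CONTAINS(Body, N'FORMSOF(INFLECTIONAL, minimize)');

SELECT ArticleId FROM dbo.Articles
WHERE CONTAINS(Body, N'FORMSOF(THESAURUS, folder)');
```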
20
FREETEXT
Matching the meaning, but not the exact wording, of specified words or phrases
Always uses inflectional forms and thesaurus
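FREETEXT takes plain text with no operator syntax. A sketch, again assuming the hypothetical dbo.Articles table with Title and Body indexed:

```sql
-- Search one column; the text is word-broken, stemmed,
-- and expanded through the thesaurus automatically.
SELECT ArticleId, Title
FROM dbo.Articles
WHERE FREETEXT(Body, N'minimizing local storage');

-- Search several indexed columns at once.
SELECT ArticleId, Title
FROM dbo.Articles
WHERE FREETEXT((Title, Body), N'minimizing local storage');

-- Search every full-text indexed column of the table.
SELECT ArticleId, Title
FROM dbo.Articles
WHERE FREETEXT(*, N'minimizing local storage');
```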
21
CONTAINSTABLE and FREETEXTTABLE
Return a relevance ranking value (RANK) and full-text key (KEY) for each row
The actual RANK values are unimportant and typically differ each time the query is run
ISABOUT/WEIGHT influence the ranking in CONTAINSTABLE
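Because these functions return only KEY and RANK, they are normally joined back to the base table on the full-text key column. A sketch using the hypothetical dbo.Articles table, whose key is ArticleId; the weights and the row limit of 10 are arbitrary example values.

```sql
-- Weighted search: "hotel" counts more than "motel" toward RANK.
-- The last argument keeps only the 10 highest-ranked rows.
SELECT a.ArticleId, a.Title, ft.[RANK]
FROM CONTAINSTABLE(dbo.Articles, Body,
         N'ISABOUT (hotel WEIGHT (0.8), motel WEIGHT (0.2))', 10) AS ft
JOIN dbo.Articles AS a
  ON a.ArticleId = ft.[KEY]
ORDER BY ft.[RANK] DESC;

-- The FREETEXT counterpart.
SELECT a.ArticleId, a.Title, ft.[RANK]
FROM FREETEXTTABLE(dbo.Articles, Body, N'cheap hotel near the beach') AS ft
JOIN dbo.Articles AS a
  ON a.ArticleId = ft.[KEY]
ORDER BY ft.[RANK] DESC;
```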
22
Example: Queries
23
Stoplists
A mechanism to discard commonly occurring strings that do not help the search
Examples: a, is, the, by, and, …
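A sketch of creating a custom stoplist and attaching it to an index. The stoplist name, the added word, and the dbo.Articles table are all hypothetical.

```sql
-- Start from a copy of the system stoplist.
CREATE FULLTEXT STOPLIST ArticlesStoplist FROM SYSTEM STOPLIST;
GO

-- Add a domain-specific stopword for US English (LCID 1033).
ALTER FULLTEXT STOPLIST ArticlesStoplist ADD N'lorem' LANGUAGE 1033;
GO

-- Attach the stoplist to an existing full-text index.
ALTER FULLTEXT INDEX ON dbo.Articles SET STOPLIST ArticlesStoplist;
GO

-- Inspect the stopwords for that language.
SELECT sw.stopword
FROM sys.fulltext_stopwords AS sw
JOIN sys.fulltext_stoplists AS sl
  ON sl.stoplist_id = sw.stoplist_id
WHERE sl.name = N'ArticlesStoplist'
  AND sw.language_id = 1033;
```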
24
Thesaurus
Nicknames: Robert/Bob
Common misspellings: calendar/calender
Homophones: Geoff/Jeff
Technical terms: proc/procedure
Very powerful if you log searches and learn what users are commonly searching for
25
Thesaurus
One file per language
Expansions: “bike” in addition to “bicycle”
Replacements: “calendar” instead of “calender”
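Thesaurus rules live in a per-language XML file that is edited outside of T-SQL. Once the file has been changed, a sketch of reloading it and using it in a query follows. It assumes the US English file (LCID 1033) defines an expansion for "bike" and reuses the hypothetical dbo.Articles table.

```sql
-- Reload the thesaurus file for US English after editing it.
EXEC sys.sp_fulltext_load_thesaurus_file 1033;

-- CONTAINS consults the thesaurus only when asked to.
SELECT ArticleId, Title
FROM dbo.Articles
WHERE CONTAINS(Body, N'FORMSOF(THESAURUS, bike)');

-- FREETEXT consults it automatically.
SELECT ArticleId, Title
FROM dbo.Articles
WHERE FREETEXT(Body, N'bike');
```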
26
Filters
Extract textual information from the document (removing the formatting)
Send the text to the word-breaker component for the language associated with the column
Need to manually install Office 2010 and PDF filters
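A sketch of checking which document types the instance can filter, and of asking SQL Server to pick up filters registered with the operating system after a filter pack has been installed. Depending on the setup, an instance restart may still be needed.

```sql
-- Document types (file extensions) that currently have a filter.
SELECT document_type, path, manufacturer
FROM sys.fulltext_document_types
ORDER BY document_type;

-- Allow filters and word breakers registered with the OS to be loaded.
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;

-- Restart the filter daemon hosts so the new filters are picked up.
EXEC sp_fulltext_service 'restart_all_fdhosts';
```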
27
Example: FTS Components
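The demo for this slide is not in the transcript. One way to watch the components at work is sys.dm_fts_parser, which shows how a string is broken, stemmed, and filtered. This sketch assumes US English (1033), the system stoplist (stoplist_id 0), and accent-insensitive parsing; it requires sysadmin membership.

```sql
-- Stemmer: inflectional forms generated for one word.
SELECT display_term, expansion_type, source_term
FROM sys.dm_fts_parser(N'FORMSOF(INFLECTIONAL, "minimized")', 1033, 0, 0);

-- Word breaker and stoplist: each term in a phrase,
-- with stopwords flagged in special_term.
SELECT occurrence, display_term, special_term
FROM sys.dm_fts_parser(N'"the hotel is by the local store"', 1033, 0, 0);
```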
28
Where to Store Large Objects?
Database or file system
Considerations: security, manageability, recoverability, transactional consistency, performance
29
Why Store in the Database?
Integrating unstructured data into the relational database provides significant benefits:
Integrated storage and data management capabilities (e.g., backup)
Ease of administration and policy management
Full-text search
30
FILESTREAM
A database/file system hybrid
FILESTREAM is an attribute that can be assigned to a varbinary(max) column
Allows storing BLOB data in the file system
Not restricted to the 2 GB limit SQL Server imposes on BLOBs
31
FILESTREAM
SQL Server buffer pool is not used
Isolation semantics are governed by Database Engine transaction isolation levels
32
Steps to FILESTREAM
Enable at OS level
Configure at instance level
Create a filegroup
Add a file to the filegroup (indicate root folder)
33
OS-level Configuration of FILESTREAM
34
Instance-level Configuration of FILESTREAM
35
Example: FILESTREAM
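The demo script is not in the transcript. A sketch of the instance-level and database-level steps follows, assuming FILESTREAM is already enabled at the OS level through SQL Server Configuration Manager. DemoDb, the filegroup and file names, the C:\Data folder (which must already exist), and the table are all hypothetical.

```sql
-- Instance level: 2 enables both T-SQL and Win32 streaming access.
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO

-- Create a FILESTREAM filegroup and point it at a root folder.
ALTER DATABASE DemoDb
    ADD FILEGROUP DemoFsGroup CONTAINS FILESTREAM;
GO
ALTER DATABASE DemoDb
    ADD FILE (NAME = N'DemoFsData', FILENAME = N'C:\Data\DemoFsData')
    TO FILEGROUP DemoFsGroup;
GO

USE DemoDb;
GO

-- A FILESTREAM table needs a unique ROWGUIDCOL column.
CREATE TABLE dbo.DocumentStore (
    DocumentId uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    FileName   nvarchar(260)  NOT NULL,
    Content    varbinary(max) FILESTREAM NULL
);
GO

-- The BLOB is written to the file system, not to the data pages.
INSERT INTO dbo.DocumentStore (FileName, Content)
VALUES (N'hello.txt', CAST('Hello, FILESTREAM' AS varbinary(max)));
```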
36
FILESTREAM
All data access must be transactional
Must use specific APIs for file I/O
Do not edit the files directly!
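The server side of transactional file access can be sketched as below, assuming a hypothetical dbo.DocumentStore table with a FILESTREAM column named Content. A client application would pass the two returned values to a streaming API such as OpenSqlFilestream or .NET's SqlFileStream before committing.

```sql
BEGIN TRANSACTION;

-- PathName() returns the logical path of the BLOB;
-- the transaction context token ties file I/O to this transaction.
SELECT Content.PathName()                   AS FilePath,
       GET_FILESTREAM_TRANSACTION_CONTEXT() AS TxContext
FROM dbo.DocumentStore
WHERE FileName = N'hello.txt';

-- ... the client streams the file here using the path and token ...

COMMIT TRANSACTION;
```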
37
When to Use FILESTREAM
Objects that are being stored are, on average, larger than 1 MB (store smaller objects in the database)
Fast read access is important
You are using a middle tier for application logic
38
FileTables
A special, fixed-schema kind of table
Builds on top of existing FILESTREAM capabilities
Store files and documents in the database, but access them from Windows applications as if they were stored in the file system (Win32 API)
39
FileTables
Hierarchical namespace
Includes file system properties as columns
Preserves full file names
Non-transactional access through the file system
40
FileTables Calls to create or change a file or directory through the Windows share are intercepted by a SQL Server component and reflected in the corresponding relational data in the FileTable
41
Example: FTS over FileTables
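The demo script is not in the transcript. A sketch of the idea follows, assuming a hypothetical database DemoDb that already has a FILESTREAM filegroup. The directory, table, constraint, and catalog names are illustrative. The unique constraint on stream_id is named explicitly so it can serve as the full-text key index.

```sql
-- Allow non-transactional access through the Windows share.
ALTER DATABASE DemoDb
    SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'DemoDb');
GO

USE DemoDb;
GO

-- The FileTable schema is fixed; only options are specified.
CREATE TABLE dbo.DocumentFiles AS FILETABLE
WITH (
    FILETABLE_DIRECTORY = N'DocumentFiles',
    FILETABLE_STREAMID_UNIQUE_CONSTRAINT_NAME = UQ_DocumentFiles_StreamId
);
GO

CREATE FULLTEXT CATALOG DocumentFilesCatalog;
GO

-- file_type (the file extension) tells FTS which filter to use
-- for the binary content in file_stream.
CREATE FULLTEXT INDEX ON dbo.DocumentFiles
(
    name        LANGUAGE 1033,
    file_stream TYPE COLUMN file_type LANGUAGE 1033
)
KEY INDEX UQ_DocumentFiles_StreamId
ON DocumentFilesCatalog
WITH CHANGE_TRACKING AUTO;
GO

-- Files copied into the share become searchable rows.
SELECT name, file_stream.GetFileNamespacePath() AS RelativePath
FROM dbo.DocumentFiles
WHERE CONTAINS(file_stream, N'hotel');
```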
42
FileTables vs FILESTREAM
File and directory hierarchy maintained in the database
Windows application compatibility
Relational access to file attributes
Both are available in all editions
43
Wrap Up
Advanced searching on character-based data, including documents
FTS setup, components, and queries
FILESTREAM
FileTables
44
Other Topics
Document-property search
Semantic search
Optimizations
Query plans and execution traces
45
References
Posts and presentations by Bob Beauchemin: http://www.sqlskills.com/blogs/bobb/
SQL Server FTS Team Blog: http://blogs.msdn.com/b/sqlfts
SQL Server 2012 Books Online: http://msdn.microsoft.com/en-us/library/cc645577(SQL.110).aspx
46
Filter Packs
Adobe PDF Filter: http://www.adobe.com/support/downloads/thankyou.jsp?ftpID=4025&fileID=3941
Office 2010 Filters: http://www.microsoft.com/en-us/download/details.aspx?id=17062