Michael Rys, Program Manager SQL Server @SQLServerMike Blazing-Fast Performance: Explore FileTables, Full-Text Search and Semantic Search for Faster, More Meaningful Results Across Unstructured Data Michael Rys, Program Manager SQL Server @SQLServerMike
Rich Unstructured Data in SQL Server 2012 Address important customer requests for Capabilities and Rich Services for Rich Unstructured Data (RUDS) Scale Up storage and search to 100M to 500M documents Easy use/access to unstructured data from all applications Rich insight into unstructured data to make better decisions
Integrated Management of documents in SQL Server 2012 9/16/2018 3:45 PM Integrated Management of documents in SQL Server 2012 DEMO © 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
FILETABLE details DB Level option for Non-transactional Access to File data Off by Default Applies only to Win32 access T-SQL access is fully transactional CREATE TABLE DocumentStore AS FileTable WITH (FileTable_Directory= N'Document‘); a Table of Files/Directories Table with a fixed schema Contains FILESTREAM and File Attributes Each row represents a File or a Directory System defined constraints maintain the tree integrity Table level permissions for files/directories in a FileTable (SQL security) Supported by AlwaysOn Full concurrency control T-Sql access, Win32 access and between the two Exposes File/Directory hierarchy through a Windows Share Supports Win32 APIs for File/Directory Management DB Storage is Transparent to Win32 applications SMB level of application compatibility Virtual network name (VNN) path support for transparent Win32 application failover
Full Text Search IMPROVEMENTS IN SQL SERVER 2012 Improved Performance and Scale: Scale-up to 350M documents for storage and search iFTS query perf 7-10 times faster than in SQL Server 2008 Worst-case iFTS query response times < 3 sec for corpus Be similar or better than main database search competitors New Functionality: Property Search customizable NEAR Innovation in Search: Semantic similarity Search
---------------------------------------------------------------------- Semantic Similarity Input: Text such as varchar, Office, PDF, HTML, email… Output: Rowset functions with standard SQL queries Illustrating example: Source Table Keyphrases KeyphraseDocuments Key Title Document D1 Annual Budget … D2 Corporate Earnings D3 Marketing Reports ---------------------------------------------------------------------- ID Keyword T1 revenue T2 growth T3 Windows T4 Azure … ID DocID T1 (revenue) D1 (Annual Budget) T2 (growth) D2 (Corporate Earnings) T3 (Windows) D3 (Marketing Reports) … D7 (Finance Report) D11 (Azure Strategy) T4 (Azure) 1 2b Full-Text and Semantic Processing quarter, record, revenue… + Language Models 3 2a DocumentSimilarity Keyword Index (Full-Text) DocID MatchedDocID D1 (Annual Budget) D2 (Corporate Earnings) D7 (Finance Report) D3 (Marketing Reports) D11 (Azure Strategy) … ID Keyword Colid … compDocid CompOc CompPid K1 revenue 1 10,23,123 (1,4),(5,8),(1,34) 2,5,6,8,4,3 K2 growth (1,5),(5,9),(1,34) 2,5,6,8,5,4 3
9/16/2018 3:45 PM Semantic Extraction and Relationships FullText Search in SQL Server 2012 DEMO © 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Key Takeaways SQL Server’s unstructured data support is targeting non-traditional database workloads that are growing rapidly in the enterprise. Example: Content and Collaboration apps SQL Server’s unstructured data support is targeting key ISV asks in fast growing markets: Examples: eDiscovery, Healthcare, Document management etc. Beyond Relational is key strategy to enable you to build complex data applications that go beyond relational data!
Related Content SQL Server 2012 Whitepapers and information: Tech Ed North America 2010 9/16/2018 3:45 PM Related Content SQL Server 2012 Whitepapers and information: http://www.sqlserverlaunch.com Channel 9 DataBound Episode 2: http://channel9.msdn.com MySemanticsSearch Demo: http://mysemanticsearch.codeplex.com More demo data sets and demo scripts: http://blogs.msdn.com/b/sqlfts/archive/2011/07/21/introducing-fulltext-statistical-semantic-search-in-sql-server-codename-denali-release.aspx Find Me Later At… On Twitter: @SQLServerMike Blog: http://sqlblog.com/blogs/michael_rys © 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Further. Forward. Faster. Learn more Visit the Microsoft Virtual Academy Free technical SQL Server training courses Get more Download SQL Server 2012 Trial Software 180-day evaluation of SQL Server 2012 Do more Participate in virtual launch activities and collect points for prizes The more points you earn, the bigger your prize could be
Appendix
Managing FileTable – Trouble shooting DMV to show all open non-transactional file handles sys.dm_filestream_non_transact_handles Stored Procedure to terminate open file handles sp_kill_filestream_non_transacted_handles X-events/Perf counters for trouble shooting
Full-Text Property Search --needed once per database instance to load the office filters exec sp_fulltext_service 'load_os_resources',1 go exec sp_fulltext_service 'restart_all_fdhosts' -- use property search CREATE SEARCH PROPERTY LIST p1; ALTER SEARCH PROPERTY LIST [p1] ADD N'System.Author' WITH (PROPERTY_SET_GUID = 'f29f85e0-4ff9-1068-ab91-08002b27b3d9', PROPERTY_INT_ID = 4, PROPERTY_DESCRIPTION = N'System.Author'); ALTER FULLTEXT INDEX ON fttable SET SEARCH PROPERTY LIST = p1 Go -- Querying DOCUMENT HAVING “John Doe” AS AUTHOR ONLY select * from fttable where contains(property(ftcol, 'System.Author'), '"John Doe"')
Full-Text Customizable Near --OLD NEAR SYNTAX select * from fttable where contains(*, 'test near Space') --NEW NEAR USAGES: --SPECIFY DISTANCE select * from fttable where contains(*, 'near((test, Space), 5,false)') --REDUCE DISTANCE where contains(*, 'near((test, Space), 2,false)') --ORDER OF WORDS IS SPECIFIED AS IMPORTANT where contains(*, 'near((test, Space), 5,true)')