SQL Server and SharePoint Best Frienemies Lisa Gardner Premier Field Engineer LisaG@microsoft.com http://blogs.msdn.com/sqlgardner @sqlgardner
Who am I ? What is PFE? Lisa Gardner aka SQLGardner Central Ohio native Working with SQL since 6.5 http://blogs.msdn.com/sqlgardner @SQLGardner Premier Field Engineering Reactive and Proactive support for Premier customers Architecture/Project Guidance Team Mentoring Deliver workshops Troubleshooting
Table of Contents SharePoint Overview SharePoint Databases Configuration, Setup, and Maintenance What to Look Out For
SharePoint Overview Understanding the Application.
SharePoint Glossary WFE Web Application Service Application Site Collection ULS Logs Timer Jobs
SharePoint Web Architecture
Farm > Web Application > Site Collection > Site > List > Item
There is a form of authorization at the web application level when you create "web application policies". You can grant Full Read, Full Control, Deny Write, and Deny All at the web application level. This is not the same authorization as at the lower levels.
Key Attributes Web Application Different IIS Site Created on each WFE Isolates Content Provides authentication mechanism Site Collection Container of Sites Quotas Decentralized Content Administration Also serves as a site Site Permission Inheritance Can Share layout and data with other sites Can provide unique feature set from other sites
Service Applications Provides granular pieces of functionality Some can be tied to a specific server Offers scalability, load balancing, fault tolerance for most services Many to many relationship with web applications and service applications Each web application can have a unique set of service applications
Timer Jobs SharePoint equivalent to SQL Agent OWSTimer - Windows service for SharePoint 2010 that runs jobs at a predefined schedule Uses same logging infrastructure as web tier Includes Correlation IDs Jobs can be nested
SharePoint Internal Data Logging Database ULS Logs Correlation IDs
Logging Database Stores all SharePoint usage and health data ULS trace log data Event log data Blocking SQL Queries Crawl and Query Statistics Feature Usage Page Requests +More Informational and Verbose categories in ULS logs will NOT get uploaded to the database
Logging Database
Table partitioning for ease of use and custom views
Tables partitioned for performance
Views for usability
Custom reports through Excel or Excel web parts

Because of the table partitioning, it is easier to use the provided views, as shown in the following screenshot, or to create your own custom queries when viewing data in the logging database. You can use Excel with the Excel Web App to create quick custom reports that are hosted from within SharePoint. Note that the partitioning is done by using separate tables, not by using partitioned tables. Some out-of-the-box reports make use of the logging database.
This is the ONLY database that you can query directly!
The logging database is tuned to support a heavy load of simultaneous writes. It has been tested at up to 5000 transactions/second in parallel, so where possible this database should be placed on a separate disk spindle.
The logging database can be used in a number of scenarios, most of which relate to troubleshooting, but which also include usage reporting. These include:
Poor crawl or query performance
SQL queries that are causing blocking
Timer jobs that are regularly failing
Determining how widely a feature is actually used
Listing all site collections within the farm for reporting or billing purposes
Basic configuration settings for the Usage and Health Data Collection service application, and therefore for the logging database, can be set from within Central Administration. These settings can be divided into two categories: usage data and health data.
ULS Logs
Also referred to as trace logs
Log location must be consistent on all farm servers
Over 260 categories of events
Nine columns including: Timestamp, Process, Category, EventID, Level, Message, and Correlation

Unified Logging Service (ULS) logs are a robust logging system first introduced in MOSS. By default, ULS logs contain events from all categories; there are over 266 categories of events in all. The following table lists some of the common events that you will use for troubleshooting SharePoint.
The ULS logs, also referred to as trace logs in the UI, are created by the ULS. These logs are stored in the %CommonProgramFiles%\Microsoft Shared\web server extensions\14\LOGS folder with the name format of ServerName-Date-Timestamp.log. You can change the location of these log files by using the http://CentralAdminSite:Port/_admin/metrics.aspx page. However, the location must be consistent across all servers in your farm.
A ULS log file contains nine columns: Timestamp, PID, TID, Product, Category, EventID, Level, Event Message, and Correlation.
Flood Protection can be enabled. Enabling this setting configures the system to detect repeating events in the Windows event log. When the same event is logged repeatedly, the repeating events are detected and suppressed until conditions return to a typical state.
Logging Levels: None, Critical, Error, Warning, Information, Verbose
Correlation IDs
Generated for every request
Logged from the start of a request through to the end
Useful for troubleshooting and tracing
On error pages, ULS logs, Windows Logs, SQL Traces
[Diagram: the same correlation ID (7d25d051-ca73-43 …) appears in the ULS logs of both the Web Front-End Server and the Application Server]

A correlation ID is a globally unique identifier (GUID) that is automatically generated for every request received by SharePoint. Correlation IDs are logged from the start of a request through to the end, and all events that belong to the request share the same correlation ID. These IDs are very useful for troubleshooting and tracing requests. Correlation IDs are shown on error pages, in ULS trace logs, in Windows logs, on the Developer Dashboard, and even in SQL traces.
Consider a scenario where you have a farm with multiple servers. Excel is hosted on a different server than the WFE server. As this conversation happens across layers, you can follow the whole conversation in the trace logs by tracing and filtering on the correlation ID, as shown in the graphic on the slide above. You can find the correlation IDs across servers and all trace logs with the Get-SPLogEvent cmdlet as follows:
Get-SPLogEvent | Where-Object {$_.Correlation -eq "f5bbb9dc-0f92-41b3-9ae9-8487352dcf0e"}
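When the log files have been copied off the farm and the SharePoint cmdlets are not available, the same filtering idea can be sketched in a few lines of Python. This is only an illustration: it assumes the log is tab-delimited with the nine columns listed in the ULS notes, and the sample events and the helper names (`parse_line`, `filter_by_correlation`) are hypothetical.

```python
# Sketch: trace one request across ULS log lines by correlation ID.
# Assumption: tab-delimited lines with the nine columns named in the
# ULS notes. The sample events and helper names are made up.
COLUMNS = ["Timestamp", "PID", "TID", "Product", "Category",
           "EventID", "Level", "Message", "Correlation"]


def parse_line(line):
    """Split one log line into a dict keyed by column name."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) != len(COLUMNS):
        return None  # continuation or malformed line
    return dict(zip(COLUMNS, fields))


def filter_by_correlation(lines, correlation_id):
    """Return every parsed event that carries the given correlation ID."""
    wanted = correlation_id.lower()
    hits = []
    for line in lines:
        event = parse_line(line)
        if event and event["Correlation"].lower() == wanted:
            hits.append(event)
    return hits


# Hypothetical events from two servers, already merged into one list.
SAMPLE = [
    "\t".join(["01/01/2012 10:00:00.10", "w3wp.exe", "0x1A",
               "SharePoint Foundation", "General", "abc1", "Medium",
               "Request started", "f5bbb9dc-0f92-41b3-9ae9-8487352dcf0e"]),
    "\t".join(["01/01/2012 10:00:00.20", "OWSTIMER.EXE", "0x2B",
               "SharePoint Foundation", "Timer", "abc2", "Medium",
               "Unrelated timer job event", "11111111-2222-3333-4444-555555555555"]),
    "\t".join(["01/01/2012 10:00:00.30", "w3wp.exe", "0x3C",
               "Excel Services", "Web Front End", "abc3", "High",
               "Call to application server", "f5bbb9dc-0f92-41b3-9ae9-8487352dcf0e"]),
]

if __name__ == "__main__":
    for event in filter_by_correlation(SAMPLE, "f5bbb9dc-0f92-41b3-9ae9-8487352dcf0e"):
        print(event["Timestamp"], event["Product"], event["Message"])
```

The comparison is case-insensitive because GUIDs are often pasted from error pages in a different case than they appear in the logs.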
SharePoint Databases So Many Databases, So Little Time
Configuration and Admin Content Databases Farm Configuration Store Objects Table – Serialized Objects Binaries Table – Farm Solution Store SiteMap Table – Links a site into the configuration Content Database for Central Admin is a Content DB with very specific templates - considered to be an extension of the configuration database Backup and Recovery It is Supported to back up this database It is Not Supported to restore unless the farm is fully stopped when the backup is taken
Configuration and Admin Content Databases General Recommendations Default recovery model is Full but in most cases this database should be run in simple recovery mode Initial Data File Size: 2GB is appropriate for most situations Config databases are typically smaller and do not get much load Mirroring Supported to mirror within the farm (partner on same network as primary) Not Supported to mirror asynchronously or to log ship over WAN http://technet.microsoft.com/en-us/library/dd207314.aspx
Content Database Stores all site data in a site collection Site Metadata Web Part Pages Files uploaded to document libraries List Items Security Solutions It is supported to Mirror in Farm for High Availability It is supported to Mirror Asynchronously or Log Ship over WAN for disaster recovery General Recommendations Run in Full recovery mode only if the site data requires point in time restores Tip: put content DBs in simple recovery mode during upgrades / patches
Content Database Schema
Why SharePoint seems so crazy.
Container Tables
  Sites: Id, Quota, Other Metadata
  Webs: SiteId, Id, Url, Title, ScopeId, Metadata
  AllLists: WebId, Id, Title, ItemCount, ScopeId, Fields, Metadata
Namespace Table
  Url, DirName, LeafName, WebId, Id, SiteId, ListId, DoclibRowId, Other Metadata
Userdata Table
  [Diagram: column groups 1…64, 1...32, 1..8, 1..16, 1..12, 1..8, 1…16, ~35; column types sql_variant, int, float, nvarchar, ntext, datetime, bit, other metadata]

Core schema
Userdata table: a block of nvarchar columns stores text and is reused across multiple types, so different logical fields live in the same columns. SharePoint translates a logical query into a physical query by understanding this structure.
Namespace table: maps the folder and file hierarchy.
Container tables: hold the site mapping hierarchy tables.
====================================
How do we index this bad boy?
We can’t use traditional SQL indexes to index the table: there are thousands of types, so we would need thousands of SQL indexes. The index is also polluted by the many different data types stored in the same SQL type column, so statistics will be invalid.
So we create a Name Value Pair table in SQL and index it using classic SQL indexes (effectively an application-level index), letting SharePoint use SQL Server to locate the data it needs.
How does the SQL Query Processor know what the best query is when it doesn’t understand the data? We join from a SQL-indexed Name Value Pair table to the SharePoint content table. This has overhead on small lists but avoids table scans on larger lists and improves performance. We therefore dictate the query execution plan based on what SharePoint knows, to avoid instability and performance issues due to complex non-index-driven queries.
SQL is a very capable data management platform, but SharePoint has put unique stresses on SQL that have not always had the best results.
It used to be one table per list, one db per site.
--This doesn’t scale, as you soon get thousands of tables within SQL Server, which causes management overhead.
--Security was a problem. Trying to delete a user caused SQL to have to check every table for any reference to that user. On servers with lots of lists you would see a stack overflow as SQL tried to manage the referential integrity.
Many of the new changes in SQL Server were a response to a SharePoint need (e.g. Sparse Columns, FILESTREAM, wide tables, partitioned indexes, etc.).
Content Database Layout
Can contain 1-2000 site collections
Scale out at the db level and the instance level
Sizing Guidance
<200GB
  Maintenance tasks stay manageable
  Makes db movement and DR easier
  Plan for 2 IOPS per GB of data
Can have 200GB-4TB if the storage provides at least .25 IOPS per GB
Size and load depend on the sites the databases contain
Separate very active sites into different site collections/content dbs
Can have 32,767 dbs per instance, but recommend 200 per instance as manageability can be an issue
300 DBs per Web Application
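The per-GB figures above translate into a storage target with simple arithmetic. The following Python sketch shows the calculation; the function names are illustrative only, and treating 4TB as 4096 GB is an assumption made for the example.

```python
# Sketch: IOPS targets for a content database, using the per-GB figures
# from the sizing guidance. Names are illustrative, not part of any
# SharePoint or SQL Server tooling.
RECOMMENDED_IOPS_PER_GB = 2.0   # "Plan for 2 IOPS per GB"
LARGE_DB_IOPS_PER_GB = 0.25     # figure quoted for 200GB-4TB databases
FOUR_TB_IN_GB = 4096            # assumption: 1 TB counted as 1024 GB


def iops_target(size_gb, iops_per_gb):
    """IOPS the storage should deliver for a database of size_gb."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    return size_gb * iops_per_gb


def sizing_band(size_gb):
    """Name the sizing band a content database falls into."""
    if size_gb < 200:
        return "under 200GB"
    if size_gb <= FOUR_TB_IN_GB:
        return "200GB-4TB"
    return "over 4TB"


if __name__ == "__main__":
    for size in (150, 1024):
        print(size, "GB:", sizing_band(size),
              "| at 2 IOPS/GB:", iops_target(size, RECOMMENDED_IOPS_PER_GB),
              "| at .25 IOPS/GB:", iops_target(size, LARGE_DB_IOPS_PER_GB))
```

For a 150 GB database the 2 IOPS per GB figure gives 300 IOPS; for a 1 TB database the .25 IOPS per GB figure gives 256 IOPS.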
Service Application DBs Search Admin Crawl Property Profile Synchronization Social Tagging Web Analytics Reporting Staging Logging BDC State Secure Store Reporting Services Power Pivot Project Server Performance Point
Service Application Databases Performance Considerations The different service application dbs have a wide variety of performance/sizing considerations. Write-Intensive dbs Usage and Health Data Collection database (Logging) Web Analytics Reporting database (during load) Search service application Crawl database (during crawls) Search service application Property database User Profile service application Synchronization database
Service Application Databases Performance Considerations Cont’d Read Intensive DB’s Web Analytics Reporting database Search service application Crawl database User Profile service application Profile database User Profile service application Synchronization database User Profile service application Social Tagging database Reporting database (Project Server)
Database Scale Out Guidance
[Diagram: SQL instances hosting Search, Content, Logging, Web Analytics, Admin/Content, and Other databases]
Can vary greatly depending on usage patterns
Typically scale out Search first
Then isolate content databases, adding additional dbs to the same instance first before moving to a dedicated instance
Logging and Web Analytics
More instances for content OR heavily used Service Applications
Configuration, Setup, and Maintenance
Planning for SharePoint Setup
Allow the SharePoint installer to create databases
  Modify file sizes and growth settings
  Rename dbs to remove GUIDs
SharePoint setup and admin accounts required roles: DB Creator, Security Admin
  Can be removed for the setup account but will need to be added again for any further installs (not recommended)
    Patching/Service packs
    Adding a new Service Application
Add Service Application account logins
  Requires db_owner role in DB

Various databases have specific settings, such as recovery model, already set for best practices; keep those as is.
--You can pre-create all the databases, but it makes the SharePoint configuration process a bit more cumbersome. There are plenty of resources online.
Many of the DBs will get created with GUIDs in their names; rename them to make them easily identifiable, such as SP2010_Profile_Sync. If SharePoint is configured via PowerShell, the dbs can be created with custom names.
DBs are created and the appropriate permissions are given to the correct accounts. Service accounts will require the db_owner role in the application databases.
The requirement for Security Admin is one of the many reasons that it is best to dedicate an instance to SharePoint and not co-host with other application databases. While it is supported to remove it from the setup account, it is not recommended.
Instance Configuration
Follow general Best Practices for SQL Configuration
Use Latin1_General_CI_AS_KS_WS collation
Configure for heavy TempDB usage
  Multiple data files
  Data and log files separated/isolated
  Pre-size data files
Set max degree of parallelism to 1
  SharePoint overrides with MAXDOP
Set max server memory and use Lock Pages In Memory
Consider setting fill factor (%) to 80

For the most part, follow best practices. They are not all mentioned here; that would be a session all by itself.
Collation: this does not have to be the instance collation, but that makes things more manageable. If it is not the instance collation, you must make sure that all databases are created with this collation.
TempDB: size it at .25 x the largest db.
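The TempDB rule of thumb (.25 x the largest db) combined with multiple pre-sized data files can be worked through as below. This is a sketch under stated assumptions: the database names and sizes are invented, and splitting the total evenly across the data files is a common convention rather than something the slide specifies.

```python
# Sketch: pre-size TempDB data files from the ".25 x largest db" rule.
# Assumptions: database names/sizes are hypothetical, and the total is
# split evenly across data files (the file count is the caller's choice).
TEMPDB_RATIO = 0.25


def tempdb_plan(db_sizes_gb, data_file_count):
    """Return total and per-file TempDB data sizes in GB."""
    if not db_sizes_gb:
        raise ValueError("at least one database size is required")
    if data_file_count < 1:
        raise ValueError("data_file_count must be at least 1")
    largest_name = max(db_sizes_gb, key=db_sizes_gb.get)
    total_gb = TEMPDB_RATIO * db_sizes_gb[largest_name]
    return {
        "largest_db": largest_name,
        "total_gb": total_gb,
        "per_file_gb": total_gb / data_file_count,
    }


if __name__ == "__main__":
    sizes = {"WSS_Content_A": 180, "WSS_Content_B": 120, "SP2010_Profile_Sync": 40}
    print(tempdb_plan(sizes, data_file_count=4))
```

With a largest database of 180 GB and four data files, the rule gives 45 GB of TempDB data space, or 11.25 GB per file.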
Database Configuration
Do not use Auto Shrink
Set Auto Create Statistics OFF
Set Page Verify to Checksum
Set Auto Grow sizes to MB, not Percent
Pre-size for growth
Monitor utilization and grow manually!

Auto Update Statistics: leave it as is. It should be off for content databases.
Index Maintenance Index Maintenance is extremely important in SharePoint DMV Sys.dm_db_index_physical_stats can be used to report index fragmentation SharePoint 2007 by default would rebuild every index via a Timer Job SharePoint 2010 does a much better job at keeping index fragmentation in check It only rebuilds indexes that are fragmented Updates statistics
Health Analyzer Rules Index defragmentation and statistics maintenance address the following databases: Configuration databases Content databases User Profile: Profile databases User Profile: Social databases Web Analytics Reporting databases Web Analytics Staging databases Word Automation Services databases Search Property/Crawl databases These databases contain proc_DefragmentIndices Run daily
Health Analyzer Rules Cont’d
Search Property database
  Proc_MSS_DefragSearchIndexes
  Run weekly
Crawl database
  Proc_MSS_DefragGathererIndexes
  Manual
  Always report as fragmented
  Execute this rule after the first full crawl
The databases below do not have an automated mechanism in place; there really isn't a need:
  Search Administration
  Secure Store
  State Service
  User Profile: Sync
  Usage (Logging)
  Managed Metadata
  Business Connectivity Services
  PerformancePoint Services
Statistics Health Analyzer rules rebuild indexes and update statistics AutoUpdate – off in SP 2010 by default Update manually when: Query execution times are slow After maintenance operations such as table truncation or a large batch insert/update/delete
Why is Index/Stats Maintenance So Important?
GUIDs are used as clustered primary keys
  Random values = unpredictable insert pattern
  16 bytes each
Heavy insert/update activity
These properties lead to rapid index fragmentation due to many page splits
Fillfactor helps delay the inevitable but increases space usage
SharePoint rebuilds indexes with fillfactor of 80

It’s soapbox time!
A GUID column stores 16-byte binary values that operate as globally unique identifiers (GUIDs). A GUID is a unique binary number; no other computer in the world will generate a duplicate of that GUID value. The main use for a GUID is for assigning an identifier that must be unique in a network that has many computers at many sites.
From BOL: The uniqueidentifier data type has the following disadvantages:
--The values are long and obscure. This makes them difficult for users to type correctly, and more difficult for users to remember.
--The values are random and cannot accept any patterns that may make them more meaningful to users.
--There is no way to determine the sequence in which uniqueidentifier values were generated. They are not suited for existing applications that depend on incrementing key values serially.
--At 16 bytes, the uniqueidentifier data type is relatively larger than other data types, such as 4-byte integers. This means indexes that are built using uniqueidentifier keys might be relatively slower than indexes using an int key.
Consider using the IDENTITY property when global uniqueness is not required, or when having a serially incrementing key is preferred.
DEMO (time permitting): Look at the AllDocs table to show all the GUID values, a calculation showing the small number of rows per page, and the order of the values to illustrate page splits.
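The link between random keys and page splits can be illustrated with a toy simulation. The Python sketch below is not how SQL Server manages pages; it assumes a simplified model in which each page holds a fixed number of keys, a full page is split in half when a key lands in the middle of the index, and a key that lands past the end of the index simply starts a new page. The page capacity, the row count, and the use of random 128-bit integers as stand-ins for GUIDs are all assumptions.

```python
# Toy model: count page splits for sequential keys vs. random 128-bit keys.
# Assumptions: fixed rows per page, 50/50 split on overflow, and no split
# when a key is appended past the end of the last page.
import bisect
import random

PAGE_CAPACITY = 100  # hypothetical; real capacity depends on row size


def count_page_splits(keys, capacity=PAGE_CAPACITY):
    """Insert keys one at a time into sorted pages; return the split count."""
    pages = [[]]  # each page is a sorted list of keys
    splits = 0
    for key in keys:
        firsts = [page[0] for page in pages if page]
        idx = max(bisect.bisect_right(firsts, key) - 1, 0)
        page = pages[idx]
        if len(page) < capacity:
            bisect.insort(page, key)
            continue
        if idx == len(pages) - 1 and key > page[-1]:
            pages.append([key])  # growth at the end of the index: no split
            continue
        half = capacity // 2
        new_page = page[half:]   # move the upper half to a new page
        del page[half:]
        pages.insert(idx + 1, new_page)
        splits += 1
        target = new_page if key >= new_page[0] else page
        bisect.insort(target, key)
    return splits


def compare(row_count=5000, seed=42):
    """Return (splits with sequential keys, splits with random keys)."""
    rng = random.Random(seed)
    sequential = list(range(row_count))
    guid_like = [rng.getrandbits(128) for _ in range(row_count)]
    return count_page_splits(sequential), count_page_splits(guid_like)


if __name__ == "__main__":
    seq_splits, rand_splits = compare()
    print("sequential keys:", seq_splits, "splits")
    print("random 128-bit keys:", rand_splits, "splits")
```

In this model, ever-increasing keys never split a page, while random keys keep landing in full pages in the middle of the index. Leaving free space in each page, which is what a fill factor of 80 does after a rebuild, only postpones the first split.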
What to Look Out For Common issues
New Content Databases Use DBA-created content databases! SharePoint hard codes small file size and growth settings Automation Options: PowerShell is a great option to allow SP Administrators to create dbs! Have a number of empty DBs already created Must Do’s Use Latin1_General_CI_AS_KS_WS collation Set appropriate recovery model for your recovery needs Add SP farm setup account and service account with db_owner role
Full Crawl Impact
A full crawl is a very intensive operation that can have an impact on other dbs hosted on that instance. If asked about an overall performance slowdown, ask if a crawl is running.
It is common to see deadlocking in the Crawl database during this time.
If size grows rapidly, ask about the depth of crawling links in documents.
Ensure Index Maintenance is Running Health Analyzer Rule Definition Databases used by SharePoint have fragmented indices Databases used by SharePoint have outdated index statistics Health Analysis Job in Logging DB Details in ULS logs DEMO: Show the entries in the demo ULS log – search the file in notepad for “fragment” – then open ULSLogViewer.exe and do the same – show the rows where it says that it is scanning each DB
Excessive Blocking
Common scenario: “The SQL Server is slow”
Ask for ULS Log info
Blocking/Deadlocks can be common in content DBs
Try a manual update stats
Inquire about large lists, dbs over threshold, and other capacity limitations being exceeded
Ask about list throttling and “happy hour”
Read Committed Snapshot Isolation is not supported

SharePoint list throttling: queries against a list that would return > 5,000 items (20,000 for admins or auditors) will not be serviced during normal hours. There is a "happy hour" timeframe that the SP Farm Admin can enable after hours to allow for large queries. Be aware of whether they have enabled it and what time it is.
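The throttling rule described above can be expressed as a small decision function, which is handy when reasoning about whether a given query would have been serviced at a given time. The sketch below models only the rule as stated on this slide; it does not call any SharePoint API, and the function names and the example happy-hour window are hypothetical.

```python
# Sketch of the list-throttling rule as stated in the notes. This models
# the rule only; it does not call SharePoint. Names and the example
# window are hypothetical.
from datetime import time

USER_THRESHOLD = 5000     # items, normal users
ADMIN_THRESHOLD = 20000   # items, admins or auditors


def in_happy_hour(now, start, end):
    """True if now is in [start, end); handles windows crossing midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def query_allowed(item_count, is_admin_or_auditor, now, happy_hour=None):
    """Would a query returning item_count items be serviced at time now?"""
    if happy_hour is not None and in_happy_hour(now, *happy_hour):
        return True  # large queries are allowed during the window
    limit = ADMIN_THRESHOLD if is_admin_or_auditor else USER_THRESHOLD
    return item_count <= limit


if __name__ == "__main__":
    window = (time(22, 0), time(2, 0))  # hypothetical 10 PM to 2 AM window
    print(query_allowed(6000, False, time(14, 0), window))   # normal hours
    print(query_allowed(6000, False, time(23, 30), window))  # inside window
```

When a DBA sees very large content database queries outside business hours, checking them against the configured window is a quick way to tell expected happy-hour activity from a throttling setting that has been raised or disabled.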
Others
ASYNC_NETWORK_IO waits
Disk IO
TempDB Bottleneck
Very Large Queries
Logging is the ONLY DB to be queried directly

ASYNC_NETWORK_IO waits
This is SQL Server waiting on the client app to consume the result set. It is relatively common in SharePoint, as many result sets include documents, which are large compared to rows of data. It can indicate a network issue IF Network Interface: Output Queue Length is >2 on average. It can also indicate that a WFE or APP server is CPU bound and is not able to consume results quickly.

Disk IO
Due to the very random nature of data access, the large row size for document data, and rapid index fragmentation, disk performance is KEY to a high performance SharePoint db infrastructure. Look for PAGEIOLATCH waits. Also view sys.dm_io_virtual_file_stats.

TempDB Bottleneck
TempDB is very heavily utilized in SharePoint. PAGELATCH contention can happen in certain circumstances. You will see PAGELATCH waits with a wait resource of 2:X:Y which denotes dbid, fileid, page identifier (not pageid) in sys.d

Large Queries
DBCC MEMORYSTATUS small/medium/big gateway
Show the AllUserData table to illustrate how very wide lists can create very complex queries.
Queries are submitted to the server for compilation. The compilation process includes parsing, algebraization, and optimization. Queries are classified based on the amount of memory that each query will consume during the compilation process. When a query starts, there is no limit on how many queries can be compiled. As the memory consumption increases and reaches a threshold, the query must pass a gateway to continue. There is a progressively decreasing limit of simultaneously compiled queries after each gateway. The size of each gateway depends on the platform and the load. Gateway sizes are chosen to maximize scalability and throughput. If the query cannot pass a gateway, the query will wait until memory is available. Or, the query will return a time-out error (Error 8628). Additionally, the query may not acquire a gateway if the user cancels the query or if a deadlock is detected. If a query passes several gateways, the query does not release the smaller gateways until the compilation process has completed. See KB907877.

Querying DBs Directly
It seems totally unnatural to the DBA, but with the exception of the logging database, do not query the dbs directly! There are PowerShell commands for pretty much anything you could possibly want to do in SharePoint.
SP2013 Changes Shredded storage to minimize storage needs with versioning Sparse Columns to support wide lists Web Analytics redesigned: more robust/scalable Profile Sync: in tests, an import of 300k users that took 3 weeks now takes only 7 hours Stretch farms no longer supported: all databases must now reside in the same data center
Helpful Links
Know the Limits! http://technet.microsoft.com/en-us/library/cc262787.aspx
More info on SharePoint DBs http://technet.microsoft.com/en-us/library/cc678868.aspx http://www.microsoft.com/en-us/download/details.aspx?id=3408
Questions?