Published by Suryadi Suharto Setiawan. Modified over 6 years ago.
1
Elastic Database Capabilities with Azure SQL DB
Silvia Doomra, Azure SQL DB Program Management

Most people in the room have seen Elastic DB pools. The elastic database umbrella covers pools plus a number of other capabilities, and this session focuses on those other capabilities. They are complementary to pools: even customers who don't use pools can still use them.
2
Elastic Database Tools & Services - Goals
Simplify the creation and operation of SaaS solutions in Azure that grow to use large numbers of databases
Develop OLTP applications with scaled-out data tiers in Azure SQL DB
Scale (grow or shrink) Azure SQL DB resources as needed
Manage operations over many Azure SQL DB databases

#1 How do I write the application when I have thousands of databases sitting around? #2 When scaling up or down is not an option, how do I scale out? #3 How do I manage operations across all these databases? We have helped many prominent SaaS applications migrate to SQL Azure.
3
Sharding and Tenancy Models
Single tenant per database
  Each tenant's data is stored in a different database
  Better isolation of tenants as compared to the multi-tenant model
Multiple tenants per database
  Multiple tenants share the same database
  Less isolation of tenants as compared to the single-tenant model
Hybrid model
  Some tenants share databases, others get their own database
  E.g., premium or paying customers get their own databases, while free-tier customers share databases

Single tenant: when a new tenant arrives, you put its data into a new database. This model is recommended whenever you can use it, because of its simplicity and manageability.
  Pros: better isolation; point-in-time restore on a tenant-by-tenant basis, so other tenants are not affected; security, since you can create different users for different tenants and databases.
  Cons: when a tenant is small relative to the per-database overhead, the model does not make sense economically.
Multi-tenant: less isolation; point-in-time restores are not easy; as tenants grow and a database reaches capacity, you need to move tenants to a different location.
Hybrid: the challenge with this or any of the other models is how to keep track of the databases you have in the tenancy model.
4
Elastic scale for your data tier
Shard map management: manage membership in a scaled-out data tier
Two types of shard maps
  Range: contiguous values
  List: explicit values
Types of sharding keys
  INT, BIGINT, GUID, VARBINARY
  All datetime types
[Diagram: Shard Map Manager metadata tables]
  [shardmaps_global] columns: smid, name; values: 1 RangeShardMap
  [shards_global] columns: sid, smid, Datasource, Databasename; values: 1 serverName DB2; 2 DB3
  [shard_mappings_global] columns: mid, smid, min, max, Sid; values: 1 100 2 200
[Diagram: shards DB1 [0-100), DB2 [ ), DB3 [ ), DB4 [ ), DB5 [ ), DB6 [ ), ..., DBn [n-n+100)]

This is where we provide the first set of capabilities: the elastic database client libraries. The Shard Map Manager (SMM) keeps track of membership, that is, how individual tenants are mapped to individual databases, and it defines which types of keys you can use to drive data distribution and mapping. Keeping track of the databases through these mappings is the first challenge. The metadata for the library is stored in a SQL DB database that you create; the SMM then creates its internal tables there.
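The difference between range and list shard maps can be sketched with a small in-memory model. This is only an illustration under stated assumptions: RangeShardMapModel, ListShardMapModel, and their methods are hypothetical names invented here, not types from the Elastic Database client library, and the real library keeps its mappings in the shard map manager database rather than in memory.

```csharp
using System;
using System.Collections.Generic;

// Illustrative in-memory model; NOT the Elastic Database client library.
public sealed class RangeShardMapModel
{
    // Each mapping covers the half-open key range [Low, High).
    private readonly List<(int Low, int High, string Database)> _mappings =
        new List<(int Low, int High, string Database)>();

    public void AddMapping(int low, int high, string database)
    {
        if (low >= high) throw new ArgumentException("Range must not be empty.");
        foreach (var m in _mappings)
        {
            // Two half-open ranges overlap when each one starts before the other ends.
            if (low < m.High && m.Low < high)
                throw new InvalidOperationException("Ranges must not overlap.");
        }
        _mappings.Add((low, high, database));
    }

    // Returns the database that owns the key, or null when no range covers it.
    public string Lookup(int key)
    {
        foreach (var m in _mappings)
        {
            if (key >= m.Low && key < m.High) return m.Database;
        }
        return null;
    }
}

public sealed class ListShardMapModel
{
    // Each mapping names one explicit key value.
    private readonly Dictionary<int, string> _mappings = new Dictionary<int, string>();

    public void AddMapping(int key, string database) => _mappings.Add(key, database);

    public string Lookup(int key) => _mappings.TryGetValue(key, out var db) ? db : null;
}

public static class ShardMapDemo
{
    public static void Main()
    {
        var ranges = new RangeShardMapModel();
        ranges.AddMapping(0, 100, "DB1");
        ranges.AddMapping(100, 200, "DB2");
        Console.WriteLine(ranges.Lookup(104));               // DB2
        Console.WriteLine(ranges.Lookup(100));               // DB2 (lower bound is inclusive)
        Console.WriteLine(ranges.Lookup(200) ?? "unmapped"); // unmapped (upper bound is exclusive)

        var list = new ListShardMapModel();
        list.AddMapping(7, "DB1");
        list.AddMapping(42, "DB3");
        Console.WriteLine(list.Lookup(42));                  // DB3
    }
}
```

A range map suits keys that are assigned in contiguous blocks, while a list map suits placing individual tenants one by one, for example in the hybrid tenancy model.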
5
Data Dependent Routing
Scenario: query a shard with a specific shardlet key/tenant ID
SELECT * FROM customers WHERE customerID = 104
[Diagram: the client app, built by the application developer and operated by admin/DevOps, uses the DDR APIs and the Shard Map Manager to reach shards DB1 [0-100), DB2 [ ), DB3 [ ), DB4 [ ), DB5 [ ), ..., DBn [n-n+100)]

The next challenge is to route the transaction request to the correct database, the one where the data is stored. As a mental picture: a customer logs in through a web browser and connects to the app; the library lives in the app.
6
Data Dependent Routing
    using System.Data.SqlClient;
    using Microsoft.Azure.SqlDatabase.ElasticScale.ShardManagement;

    // Get a routed connection for a given shardingKey
    using (SqlConnection conn = ShardMap.OpenConnectionForKey(
        shardingKey,
        connectionString /* Credentials Only */,
        ConnectionOptions.Validate /* Validate */))
    {
        using (SqlCommand cmd = new SqlCommand())
        {
            cmd.Connection = conn;
            cmd.CommandText = "SELECT dbNameField, TestIntField, TestBigIntField FROM ShardedTable";
            using (SqlDataReader sdr = cmd.ExecuteReader())
            {
                // Now consume results from the data reader…
            }
        }
    }

The method ShardMap.OpenConnectionForKey(key, connectionString, connectionOptions) returns an ADO.NET connection ready for issuing commands to the appropriate database, based on the value of the key parameter. Shard information is cached in the application by the ShardMapManager, so these requests do not typically involve a database lookup against the Global Shard Map database.

The key parameter is used as a lookup key into the shard map to determine the appropriate database for the request.

The connectionString passes only the user credentials for the desired connection. It contains no database name or server name, because the method determines the database and server from the ShardMap.

The connectionOptions enum indicates whether validation occurs when the open connection is delivered. ConnectionOptions.Validate is recommended. In an environment where shard maps may be changing and rows may be moving to other databases as a result of split or merge operations, validation ensures that the cached lookup of the database for a key value is still correct. Validation involves a brief query to the local shard map on the target database (not to the global shard map) before the connection is delivered to the application.

If validation against the local shard map fails (indicating that the cache is incorrect), the Shard Map Manager queries the global shard map to obtain the new correct value for the lookup, updates the cache, and obtains and returns the appropriate database connection.

ConnectionOptions.None (do not validate) is acceptable only when shard mapping changes are not expected while the application is online. In that case, the cached values can be assumed to be correct, and the extra validation round trip to the target database can be safely skipped, which may reduce transaction latencies and database traffic. The connectionOptions may also be set through a value in a configuration file that indicates whether sharding changes are expected during a period of time.

* ShardMap is the representation of your shard map metadata and all of its mappings.
7
Data Dependent Routing
Caching: improve performance of shard operations
  Global Shard Map (GSM): state of all shards in the Shard Map
  Local Shard Map (LSM): state of all shards on a particular shard
  Client Cache (eager/lazy): state of all shards in the Shard Map/known shards
[Diagram: the client app uses the DDR APIs and the Shard Map Manager; the GSM sits beside the application cache, and shard DB1 [0-100) holds its LSM]

The key question is the round trip that you have to make to the Shard Map Manager to get the shard information. That round trip can be costly in terms of latency, and the Shard Map Manager can become a bottleneck and a single point of failure.

In the client library, the Shard Map Manager is a collection of shard maps. The data managed by a ShardMapManager .NET object is kept in three places:

Global Shard Map (GSM): When you create a ShardMapManager, you specify a database to serve as the repository for all of its shard maps and mappings. Special tables and stored procedures are automatically created to manage the information. This is typically a small, lightly accessed database, but it should not be used for other needs of the application. The tables are in a special schema named __ShardManagement.

Local Shard Map (LSM): Every database that you specify to be a shard within a shard map is modified to contain several small tables and special stored procedures that contain and manage shard map information specific to that shard. This information is redundant with the information in the GSM, but it allows the application to validate cached shard map information without placing any load on the GSM; the application uses the LSM to determine whether a cached mapping is still valid. The tables corresponding to the LSM on each shard are in the schema __ShardManagement.

Application cache: Each application instance accessing a ShardMapManager object maintains a local in-memory cache of its mappings. It stores routing information that has recently been retrieved.
8
Elastic Scale Connection Opening Flow
    ShardMap.OpenConnectionForKey(
        104 /* Tenant ID */,
        "…" /* Credentials Only */,
        ConnectionOptions.Validate /* Validate */);

OpenConnectionForKey call with validation on
  Check for shardlet key in cache
  Cache miss: fetch mapping info from GSM
  Connect to shard
  Validate on shard
  Validation fails: go back to cache miss
[Diagram: client application with the Shard Map Manager and its cache; the GSM holds the mapping [100, 200): DB2; shard DB2 [100, 300) holds its LSM and the spValidate stored procedures]

When the call first comes in, we look for that tenant ID in the cache. Let's assume we don't find it. We go to the SMM, fetch the mapping, and populate the cache. Then we use the mapping to open the connection to the right shard. The shard holds a local copy of the mapping, and we validate against that local copy using a stored procedure that we put on the shard. You will find a number of spValidate stored procedures on your shard. We call these procedures to find out whether anything has changed between the time we looked up the cache and the time the connection was actually opened. If validation fails, we treat it as a cache miss and go back. If it succeeds, we have a connection to the shard.

© 2015 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
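The connection opening flow can be sketched as a small in-memory model: check the cache, fall back to the GSM on a miss, validate against the shard's LSM, and treat a failed validation as a cache miss. Everything in this sketch is an assumption made for illustration: RoutingModel, MoveKey, and GsmLookups are hypothetical names, the model returns a database name instead of an open connection, and it uses per-key mappings rather than ranges for brevity.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model of the cache / GSM / LSM interplay; NOT the client library.
public sealed class RoutingModel
{
    // Global shard map: authoritative key -> database mapping.
    private readonly Dictionary<int, string> _gsm = new Dictionary<int, string>();
    // Local shard maps: for each database, the keys it currently owns.
    private readonly Dictionary<string, HashSet<int>> _lsm = new Dictionary<string, HashSet<int>>();
    // Application cache: mappings this client has already looked up.
    private readonly Dictionary<int, string> _cache = new Dictionary<int, string>();

    public int GsmLookups { get; private set; }

    // Places a key on a database, or moves it there (as a split or merge would).
    public void MoveKey(int key, string toDatabase)
    {
        if (_gsm.TryGetValue(key, out var from) && _lsm.TryGetValue(from, out var oldOwner))
            oldOwner.Remove(key);
        _gsm[key] = toDatabase;
        if (!_lsm.TryGetValue(toDatabase, out var owned))
            _lsm[toDatabase] = owned = new HashSet<int>();
        owned.Add(key);
    }

    // Returns the name of the database a connection would be opened to.
    public string OpenConnectionForKey(int key, bool validate)
    {
        for (int attempt = 0; attempt < 2; attempt++)
        {
            if (!_cache.TryGetValue(key, out string database))
            {
                // Cache miss: fetch the mapping from the GSM and remember it.
                GsmLookups++;
                if (!_gsm.TryGetValue(key, out database))
                    throw new KeyNotFoundException($"No mapping for key {key}.");
                _cache[key] = database;
            }
            if (!validate) return database;

            // Validate on the shard: does its local shard map still own the key?
            if (_lsm.TryGetValue(database, out var owned) && owned.Contains(key))
                return database;

            // Validation failed: drop the stale entry and go back to cache miss.
            _cache.Remove(key);
        }
        throw new InvalidOperationException($"Mapping for key {key} kept changing.");
    }
}

public static class RoutingDemo
{
    public static void Main()
    {
        var model = new RoutingModel();
        model.MoveKey(104, "DB2");
        Console.WriteLine(model.OpenConnectionForKey(104, validate: true));  // DB2
        Console.WriteLine(model.OpenConnectionForKey(104, validate: true));  // DB2, served from cache
        model.MoveKey(104, "DB3");
        Console.WriteLine(model.OpenConnectionForKey(104, validate: false)); // DB2, stale cache entry
        Console.WriteLine(model.OpenConnectionForKey(104, validate: true));  // DB3, after revalidation
        Console.WriteLine(model.GsmLookups);                                 // 2
    }
}
```

The third call shows why skipping validation is only safe when mappings do not change while the application is online: the stale cache entry is returned without anyone noticing.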
9
Multi-shard Query
Scenario: execute a query across a set of shards (returns a UNION ALL result set)
SELECT count(*) FROM customers
[Diagram: the client app, built by the application developer and operated by admin/DevOps, uses the MSQ APIs and the Shard Map Manager to send the query to shards DB1 [0-100), DB2 [ ), DB3 [ ), DB4 [ ), DB5 [ ), ..., DBn [n-n+100) and receives a UNION ALL result set]

Multi-shard querying is used for tasks such as data collection and reporting that require running a query that stretches across several shards. (Contrast this with data-dependent routing, which performs all work on a single shard.)
10
Multi-shard Query

    using (MultiShardConnection conn = new MultiShardConnection(
        m_shardMap.GetAllShards(null),
        MultiShardTestUtils.GetTestSqlCredential()))
    {
        using (MultiShardCommand cmd = conn.CreateCommand())
        {
            cmd.CommandText = "SELECT dbNameField, TestIntField, TestBigIntField FROM ShardedTable";
            cmd.CommandType = CommandType.Text;
            cmd.Policy = MultiShardPolicy.PartialResults;

            using (MultiShardDataReader sdr = cmd.ExecuteReader(includeShardNameColumn: true))
            {
                while (sdr.Read())
                {
                    var dbNameField = sdr.GetString(0);
                    var testIntField = sdr.GetFieldValue<int>(1);
                    var testBigIntField = sdr.GetFieldValue<Int64>(2);
                    // The shard name is appended as an extra, last column.
                    string shardIdPseudoColumn = sdr.GetFieldValue<string>(3);
                }
            }
        }
    }

The main entry point into multi-shard querying is the MultiShardConnection class. As with data-dependent routing, the API follows the familiar experience of the System.Data.SqlClient classes and methods. With the SqlClient library, the first step is to create a SqlConnection, then create a SqlCommand for the connection, then execute the command through one of the Execute methods. Finally, SqlDataReader iterates through the result sets returned from the command execution. The experience with the multi-shard query APIs follows these steps:
  Create a MultiShardConnection.
  Create a MultiShardCommand for a MultiShardConnection.
  Execute the command.
  Consume the results through the MultiShardDataReader.

A key difference is the construction of multi-shard connections. Where SqlConnection operates on a single database, the MultiShardConnection takes a collection of shards as its input. One can populate the collection of shards from a shard map. The query is then executed on the collection of shards using UNION ALL semantics to assemble a single overall result. Optionally, the name of the shard where each row originates can be added to the output using the ExecutionOptions property on the command.

The code above illustrates the usage of multi-shard querying using a given ShardMap (named m_shardMap there). Note the call to m_shardMap.GetAllShards(null). This method retrieves all shards from the shard map and provides an easy way to run a query across all relevant databases. The collection of shards for a multi-shard query can be refined further by performing a LINQ query over the collection returned from that call. In combination with the partial results policy, the current capability in multi-shard querying has been designed to work well for tens up to hundreds of shards.

A current limitation of multi-shard querying is the lack of validation for the shards and shardlets that are queried. While data-dependent routing verifies that a given shard is part of the shard map at the time of querying, multi-shard queries do not perform this check. This can lead to multi-shard queries running on databases that have since been removed from the shard map.
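The UNION ALL semantics and the partial results policy can be sketched without a database. In this minimal model each shard is just a function that returns rows or fails; the query is fanned out to all shards, each row is tagged with its shard name, and a failing shard is either skipped or aborts the whole query depending on the policy. The names MultiShardQueryModel, ResultPolicyModel, and ExecuteAsync are hypothetical and chosen for this illustration; they are not the library's MultiShardConnection API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Illustrative stand-in for the result policy; NOT the library's enum.
public enum ResultPolicyModel { CompleteResults, PartialResults }

public static class MultiShardQueryModel
{
    // Runs the same query on every shard and concatenates the rows (UNION ALL),
    // tagging each row with the shard it came from.
    public static async Task<List<(string Shard, T Row)>> ExecuteAsync<T>(
        IReadOnlyList<(string Name, Func<Task<IEnumerable<T>>> Query)> shards,
        ResultPolicyModel policy)
    {
        // Fan the query out to all shards at once; WhenAll keeps the input order.
        var pending = shards.Select(s => RunOneAsync(s.Name, s.Query)).ToList();
        var outcomes = await Task.WhenAll(pending);

        var merged = new List<(string Shard, T Row)>();
        foreach (var o in outcomes)
        {
            if (o.Error != null)
            {
                // CompleteResults: one failing shard fails the whole query.
                if (policy == ResultPolicyModel.CompleteResults)
                    throw new InvalidOperationException($"Shard {o.Shard} failed.", o.Error);
                continue; // PartialResults: skip the shard and keep the rest.
            }
            foreach (var row in o.Rows) merged.Add((o.Shard, row));
        }
        return merged;
    }

    // Captures a shard's rows or its error so one failure cannot hide other results.
    private static async Task<(string Shard, List<T> Rows, Exception Error)> RunOneAsync<T>(
        string shard, Func<Task<IEnumerable<T>>> query)
    {
        try { return (shard, (await query()).ToList(), null); }
        catch (Exception ex) { return (shard, new List<T>(), ex); }
    }
}

public static class MultiShardDemo
{
    public static async Task Main()
    {
        var shards = new List<(string Name, Func<Task<IEnumerable<int>>> Query)>
        {
            ("DB1", () => Task.FromResult<IEnumerable<int>>(new[] { 1, 2 })),
            ("DB2", () => Task.FromException<IEnumerable<int>>(new TimeoutException("unreachable"))),
            ("DB3", () => Task.FromResult<IEnumerable<int>>(new[] { 3 })),
        };

        var rows = await MultiShardQueryModel.ExecuteAsync<int>(shards, ResultPolicyModel.PartialResults);
        foreach (var r in rows) Console.WriteLine($"{r.Shard}: {r.Row}");
        // Prints "DB1: 1", "DB1: 2", "DB3: 3"; the failing DB2 is skipped.
    }
}
```

With the partial results policy a caller cannot tell from the rows alone that a shard was skipped, which is one reason the shard name column is useful in reporting queries.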
11
Getting Started is Easy!
A 5-minute experience to a running app in Visual Studio! Open the NuGet page.
12
Demo Elastic Database Tools
13
Data Movement: Split/Merge
Scenario: perform a split or merge action
  Split: create two distinct shards from one
  Merge: create one shard from two distinct shards
[Diagram: the application developer and admin/DevOps use the customer-hosted Split/Merge (SM) service on shards DB1 [0-100), DB2 [ ), DB3 [ ), DB4 [ ), DB5 [ ), ..., DBn [n-n+100); the results are shown as DB2.1 [ ), DB5.1 [ ), and DB5.2 [ )]

Split/Merge ships as a Cloud Service package that customers drop into their Azure subscription.
Components of Split-Merge:
  Client library
  Web user interface
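What a split or merge does to the range mappings can be sketched as follows. This is a minimal model under stated assumptions: RangeMappingsModel and its methods are hypothetical names, and the sketch only rewrites mapping metadata, whereas the actual Split/Merge service also has to move the rows themselves between databases.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative model of mapping changes only; it moves no data.
public sealed class RangeMappingsModel
{
    private readonly List<(int Low, int High, string Database)> _ranges =
        new List<(int Low, int High, string Database)>();

    // Current mappings, ordered by lower bound.
    public IReadOnlyList<(int Low, int High, string Database)> Ranges =>
        _ranges.OrderBy(r => r.Low).ToList();

    public void Add(int low, int high, string database) => _ranges.Add((low, high, database));

    // Split: [low, high) on one database becomes [low, at) on the same database
    // and [at, high) on newDatabase.
    public void Split(int at, string newDatabase)
    {
        int i = _ranges.FindIndex(r => r.Low < at && at < r.High);
        if (i < 0) throw new ArgumentException("Split point must fall strictly inside a range.");
        var original = _ranges[i];
        _ranges[i] = (original.Low, at, original.Database);
        _ranges.Add((at, original.High, newDatabase));
    }

    // Merge: the two ranges that meet at boundary become one range on targetDatabase.
    public void Merge(int boundary, string targetDatabase)
    {
        int left = _ranges.FindIndex(r => r.High == boundary);
        int right = _ranges.FindIndex(r => r.Low == boundary);
        if (left < 0 || right < 0)
            throw new ArgumentException("No adjacent ranges meet at that boundary.");
        var merged = (_ranges[left].Low, _ranges[right].High, targetDatabase);
        // Remove the higher index first so the lower index stays valid.
        _ranges.RemoveAt(Math.Max(left, right));
        _ranges.RemoveAt(Math.Min(left, right));
        _ranges.Add(merged);
    }
}

public static class SplitMergeDemo
{
    public static void Main()
    {
        var map = new RangeMappingsModel();
        map.Add(0, 100, "DB1");
        map.Add(100, 200, "DB2");

        map.Split(150, "DB2.1");  // DB2 keeps [100, 150); DB2.1 takes [150, 200)
        map.Merge(100, "DB1");    // [0, 100) and [100, 150) become [0, 150) on DB1

        foreach (var r in map.Ranges)
            Console.WriteLine($"[{r.Low}, {r.High}) -> {r.Database}");
        // Prints "[0, 150) -> DB1" then "[150, 200) -> DB2.1"
    }
}
```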
14
Elastic DB Query
[Diagram labels: Power BI; OLTP cloud application; SQL DB Elastic Query; SQL (TDS, ODBC, JDBC, ADO); Elastic Tools libraries; Azure SQL DB v12; many DBs in Azure SQL Database]

The left side covers the case where you can change the application to use the client library. What do you do when you don't want to, or can't, change the application? Or when you want to use a tool that connects to a single database rather than many? Elastic Query provides an abstraction that looks and feels like a single database but, under the covers, knows how to scale out to multiple databases.
15
Demo Elastic Database Query
16
Elastic Database Jobs
Scenario: perform management operations across many DBs (e.g. index maintenance, DDL and DML, even queries delivering merged results)
[Diagram: admin/DevOps drives Database Jobs (cloud service), which uses the Shard Map Manager DB, a Results DB, and a Control DB to operate on shards DB1 [0-100), DB2 [ ), DB3 [ ), DB4 [ ), DB5 [ ), ..., DBn [n-n+100)]

Jobs target scenarios that require asynchronous processing. Use PowerShell or the portal to define database jobs and execute them.
17
Elastic Database Jobs: Common Use Cases
Make changes to many databases
  Deploy schema changes across many databases, using T-SQL
  Update data common to many databases, e.g., reference data
  Deploy new versions of stored procedures
  Maintain indexes across a set of databases
  Manage permissions and logins across a set of databases
Collect results from many databases
  Collect database telemetry from DMVs to monitor data tier performance
  Gather application-specific metrics and KPIs

Robust script deployment mechanism
  Makes it easy to use and manage a large set of databases
Built-in concurrency
  High efficiency: execute script on many databases in parallel
Execution history and script versioning
  Secure: encrypted credential storage and rich execution tracking
Built-in retry logic
  Failure tolerance: automatically retry over connection or other errors
Low barrier to entry
  PowerShell: installation script creates all required artifacts
  Portal: facilitates common use case management operations
  Direct T-SQL invocation: enables customers to script and automate
18
Database Jobs – Capabilities
Summary of key capabilities
  Execute T-SQL scripts: on-demand or scheduled
  Asynchronous, parallel (configurable) execution across databases
  Monitor execution progress and review status
  Automatic retry in case of failures
Operations across collections of Azure SQL Databases
  Elastic Pools
  Shard Sets
  All DBs on a Server
  Ad-hoc collections of DBs
Use stored secured credentials
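The combination of configurable parallel execution, automatic retry, and per-database status can be sketched in a few lines. This is an illustration of the pattern only, under stated assumptions: JobRunnerModel and RunAsync are hypothetical names, the "script" is an arbitrary delegate rather than T-SQL, and the sketch says nothing about how the Elastic Database Jobs service is implemented.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Illustrative model of parallel execution with retry; NOT the jobs service.
public static class JobRunnerModel
{
    public static async Task<ConcurrentDictionary<string, string>> RunAsync(
        IEnumerable<string> databases,
        Func<string, Task> script,
        int maxParallel,
        int maxAttempts)
    {
        var status = new ConcurrentDictionary<string, string>();
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            var runs = databases.Select(async db =>
            {
                // The semaphore caps how many databases run the script at once.
                await gate.WaitAsync();
                try
                {
                    for (int attempt = 1; attempt <= maxAttempts; attempt++)
                    {
                        try
                        {
                            await script(db);
                            status[db] = $"succeeded on attempt {attempt}";
                            return;
                        }
                        catch (Exception ex)
                        {
                            // Record the failure; the loop retries until attempts run out.
                            status[db] = $"failed after {attempt} attempt(s): {ex.Message}";
                        }
                    }
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            await Task.WhenAll(runs);
        }
        return status;
    }
}

public static class JobDemo
{
    public static async Task Main()
    {
        var calls = new ConcurrentDictionary<string, int>();

        // Hypothetical script: DB2 fails once with a transient error, then succeeds.
        Func<string, Task> script = db =>
        {
            int n = calls.AddOrUpdate(db, 1, (_, old) => old + 1);
            if (db == "DB2" && n == 1) throw new TimeoutException("transient connection error");
            return Task.CompletedTask;
        };

        var status = await JobRunnerModel.RunAsync(
            new[] { "DB1", "DB2", "DB3" }, script, maxParallel: 2, maxAttempts: 3);

        foreach (var db in new[] { "DB1", "DB2", "DB3" })
            Console.WriteLine($"{db}: {status[db]}");
    }
}
```

Because retries re-run the whole script, scripts run this way should be idempotent, for example by checking whether an index or column already exists before creating it.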
19
Elastic Database Transaction
The next challenge is to route the transaction request to the correct database, the one where the data is stored.
20
Demo Elastic Database Transactions
21
Elastic Database Transaction
    using (var scope = new TransactionScope())
    {
        using (var conn1 = new SqlConnection(connStrDb1))
        {
            conn1.Open();
            SqlCommand cmd1 = conn1.CreateCommand();
            cmd1.CommandText = "insert into T1 values(1)";
            cmd1.ExecuteNonQuery();
        }

        using (var conn2 = new SqlConnection(connStrDb2))
        {
            conn2.Open();
            var cmd2 = conn2.CreateCommand();
            cmd2.CommandText = "insert into T2 values(2)";
            cmd2.ExecuteNonQuery();
        }

        // Commit; without this call the transaction rolls back when the scope is disposed.
        scope.Complete();
    }

The sample code above uses the familiar programming experience of .NET System.Transactions. The TransactionScope class establishes an ambient transaction in .NET. (An "ambient transaction" is one that lives in the current thread.) All connections opened within the TransactionScope participate in the transaction. If different databases participate, the transaction is automatically elevated to a distributed transaction. The outcome of the transaction is controlled by setting the scope to complete, which indicates a commit.
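How the scope decides the outcome for every participant can be tried out without any database. The sketch below enlists two stand-in resource managers in the ambient transaction through System.Transactions and reports whether each one was told to commit or roll back. FakeDatabase and TransactionDemo are hypothetical names for this illustration; a real SqlConnection enlists itself when it is opened inside the scope, and volatile enlistments like these are only a rough approximation of what happens across Azure SQL databases.

```csharp
using System;
using System.Transactions;

// Stand-in for a database connection that has joined the ambient transaction.
public sealed class FakeDatabase : IEnlistmentNotification
{
    public string Outcome { get; private set; } = "pending";

    // Joins whatever transaction is ambient on the current thread.
    public void Insert() =>
        Transaction.Current.EnlistVolatile(this, EnlistmentOptions.None);

    // Phase 1: vote to commit.
    public void Prepare(PreparingEnlistment preparingEnlistment) =>
        preparingEnlistment.Prepared();

    // Phase 2: the transaction manager announces the outcome.
    public void Commit(Enlistment enlistment)
    {
        Outcome = "committed";
        enlistment.Done();
    }

    public void Rollback(Enlistment enlistment)
    {
        Outcome = "rolled back";
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        Outcome = "in doubt";
        enlistment.Done();
    }
}

public static class TransactionDemo
{
    // Runs one scope over two participants and returns what each was told to do.
    public static (string Db1, string Db2) Run(bool complete)
    {
        var db1 = new FakeDatabase();
        var db2 = new FakeDatabase();

        using (var scope = new TransactionScope())
        {
            db1.Insert();
            db2.Insert();
            if (complete) scope.Complete();
        } // Disposing the scope ends the transaction.

        return (db1.Outcome, db2.Outcome);
    }

    public static void Main()
    {
        Console.WriteLine(Run(complete: true));
        Console.WriteLine(Run(complete: false));
    }
}
```

Both participants always receive the same outcome, which is the all-or-nothing guarantee the slide code relies on when it inserts into T1 and T2 in different databases.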
22
Dynamic Management Views (DMV)
All DMVs related to transactions are relevant for distributed transactions in Azure SQL DB.
  sys.dm_tran_active_snapshot_database_transactions
  sys.dm_tran_active_transactions
  sys.dm_tran_current_snapshot
  sys.dm_tran_current_transaction
  sys.dm_tran_database_transactions
  sys.dm_tran_locks
  sys.dm_tran_session_transactions
  sys.dm_tran_top_version_generators
  sys.dm_tran_transactions_snapshot
  sys.dm_tran_version_store
23
Elastic DB Tools: Summary of Capabilities
Client .NET APIs
  Shard map management (SMM)
    Define groups of shards for your application
    Manage mapping of routing keys to shards
  Data dependent routing (DDR)
    Route incoming requests to the correct shard, e.g., given a customer ID
    Ensure correct routing as tenants move
    Cache routing information for efficiency
  Multi-shard query (MSQ)
    Interactive processing across several shards
    Same statement executed on all shards with UNION ALL semantics
  Elastic DB Transactions
    Support for distributed transactions across Azure SQL DBs
Management Services
  Database Jobs (DJ)
    Asynchronous processing across several databases
  Split/Merge (SM)
    Grow or shrink capacity by adding or removing scale units
    Easily move data between scale units
Querying Capabilities
  Elastic Queries (EQ)
    Interactive reporting across several shards
    Rich query semantics beyond MSQ
    Familiar SQL DB connection experience and protocol support
24
Roadmap & Next Steps
Elastic Scale APIs and customer-deployed services
  Client Library and Split/Merge: Generally Available
    Open Sourced at:
  Elastic Database Jobs: currently in Preview
  Elastic Database Query: currently in Preview
  Elastic Database Transactions: Generally Available
25
Thanks! Silvia Doomra-