Scott Klein Technical Evangelist
Show the major components necessary to approach designing for scale at the database layer. Demonstrate the different approaches for designing a scalable database layer.
Scale-Up
  Single database that houses all the data of an application
  Hard to handle peak load
  OK with exponential incremental cost
Scale-Out
  Multiple databases spread over multiple independent nodes
  Cost effective, commodity class hardware
  Typical patterns: Sharding and Horizontal Partitioning
Cloud Applications
  Require Scale Beyond Scale-Up
  Demand the Best Economics
    Best Price/Performance
    Elasticity + Pay-as-you-go
Single tenant per database
Multiple tenants per database
Multiple databases per tenant
A Few Examples
  Web Scale DB Solutions
  Multi-tenant SaaS ISVs
  Workloads with Spikes, Bursts, Peaks, etc…
  NoSQL Applications
Tenant Key
  Used for all non-reference table records
  Used by almost all queries (indexes)
  Int, bigint, GUID
Reference Table
  Lookup table
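The tenant-key convention can be sketched with a small in-memory schema. This is a minimal illustration only: the table and column names (`orders`, `order_status`, `tenant_id`) are hypothetical, and SQLite stands in for the real database engine. The tenant key leads the primary key of the non-reference table, so the index serves every tenant-scoped query, while the reference (lookup) table carries no tenant key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reference (lookup) table: shared by all tenants, no tenant key.
    CREATE TABLE order_status (
        status_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    -- Non-reference table: every row carries the tenant key,
    -- and the tenant key is the leading column of the primary key index.
    CREATE TABLE orders (
        tenant_id INTEGER NOT NULL,
        order_id  INTEGER NOT NULL,
        status_id INTEGER NOT NULL REFERENCES order_status(status_id),
        PRIMARY KEY (tenant_id, order_id)
    );
""")
conn.executemany("INSERT INTO order_status VALUES (?, ?)",
                 [(1, "open"), (2, "shipped")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(5000, 1, 1), (5000, 2, 2), (7500, 1, 1)])


def orders_for_tenant(tenant_id):
    """Return (order_id, status name) pairs for one tenant only."""
    rows = conn.execute(
        "SELECT o.order_id, s.name "
        "FROM orders AS o "
        "JOIN order_status AS s ON s.status_id = o.status_id "
        "WHERE o.tenant_id = ? "
        "ORDER BY o.order_id",
        (tenant_id,),
    )
    return rows.fetchall()
```

Because every query filters on `tenant_id`, all rows for one tenant can later be moved to another database together without changing the query shape.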
SalesDB
  Orders_federation
    Orders_Fed [5000, 7500) & [7500, 10000)
Built-in Data-Dependent Routing (DDR)
  Ensure apps can discover where the data is just-in-time
  No “Shard Map” caching
  Guaranteed member routing
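The half-open ranges above determine which member holds a given key. The sketch below only illustrates that range lookup in application code; it is not the built-in routing mechanism, which the slide describes as handled by the platform. The two ranges come from the slide, while the member labels, the `route` function, and the error handling are assumptions made for the example.

```python
from bisect import bisect_right

# Lower bounds of each member's half-open range [low, high), in sorted order.
LOWER_BOUNDS = [5000, 7500]
# Exclusive upper bound of the last member.
UPPER_BOUND = 10000
# Hypothetical labels for the two members shown on the slide.
MEMBERS = ["Orders_Fed [5000, 7500)", "Orders_Fed [7500, 10000)"]


def route(key: int) -> str:
    """Return the member whose range contains `key`."""
    if not LOWER_BOUNDS[0] <= key < UPPER_BOUND:
        raise KeyError(f"key {key} is outside every member range")
    # bisect_right counts the lower bounds that are <= key;
    # the owning member is the last of those.
    return MEMBERS[bisect_right(LOWER_BOUNDS, key) - 1]
```

A boundary value such as 7500 belongs to the member whose range starts at it, because each range includes its lower bound and excludes its upper bound.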
Defining the Tenant
  As granular as possible
  Tenant Size
    The smaller the better
    Current max size (150 GB)
  Zero to minimal cross-database requirements
  Sharding key typically equates to TenantID
  Examples:
    User (very common)
    Region
    Company
    Cost Center of Company
Establishing Tenant Surrogate Key
  Used for all non-reference table records
  Used by almost all queries (indexes)
  Ideal Key
    Small
    Fixed size
    Large domain
  GUID is common data type
    Too large
    Cause of severe fragmentation
    Painful sharding boundary values
    Painful reference when troubleshooting
  BigInt is better
    Half-size of GUID
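The "half-size" comparison can be checked with a few lines of arithmetic. In this sketch Python's `uuid` and `struct` modules stand in for the database types, on the assumption that a GUID is stored as a 128-bit value and a BigInt as a signed 64-bit value. The sample key value and the `index_key_bytes` helper are hypothetical and cover the raw key bytes only, not real index overhead.

```python
import struct
import uuid

guid_key = uuid.uuid4()   # random 128-bit identifier
bigint_key = 7500         # sample tenant key that fits in 64 bits

GUID_BYTES = len(guid_key.bytes)                    # raw size of a GUID
BIGINT_BYTES = len(struct.pack("<q", bigint_key))   # raw size of a signed 64-bit int
BIGINT_DOMAIN = 2 ** (8 * BIGINT_BYTES)             # number of distinct 64-bit values


def index_key_bytes(rows: int, key_bytes: int) -> int:
    """Raw bytes spent on the key column alone for `rows` index entries."""
    return rows * key_bytes
```

Since the tenant key appears in every non-reference table and in most indexes, the per-row difference is multiplied across the whole schema, which is one reason the smaller fixed-size key is preferred.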