Mladen Prajdić SQL Server MVP Hekaton The New SQL Server In-Memory OLTP Engine.


1 Mladen Prajdić SQL Server MVP mladenp@gmail.com Hekaton The New SQL Server In-Memory OLTP Engine

2 About me

3 Agenda
- Why do we need it?
- What is it?
- Storage engine
- Relational engine

4 Why do we need it?
- Computing power still follows Moore's Law, thanks to parallelism
- CPU clock frequency has stalled
- Memory has gotten a LOT cheaper

5 What is it?
- Hekaton = Greek for "hundred" (as in 100x)
- In-Memory OLTP
- Built directly into the DB engine
- High performance
- Fully ACID

6 What is it?

7 Rough OLTP workload distribution
- Relational Engine
- Storage Engine

8 How do we start?
- Special MEMORY_OPTIMIZED_DATA filegroup
  - All Hekaton stuff goes in there
  - The filegroup can’t be removed after creation, for now
  - But you can remove all objects and leave it empty
- Only Windows (non-SQL) BIN2 collations are allowed
  - 129 collations
  - SELECT * FROM fn_helpcollations() WHERE name LIKE '%BIN2%' AND name NOT LIKE '%SQL%'

9 Storage Engine
- Tables
- Indexes
- Transaction Log
- Checkpoints

10 Tables
- MEMORY_OPTIMIZED = ON clause
- No LOB or off-row data allowed
  - 8060 byte limit enforced at creation time
- Needs at least 1 index
- Data DURABILITY
  - SCHEMA_AND_DATA (default): must have a PK
  - SCHEMA_ONLY: ludicrous speed

11 Tables
- On CREATE TABLE
  - Schema compiled to C DLL and written to DB metadata
  - DLL loaded into memory
  - All table, index and row access done through the DLL

12 Tables - Allowed data types
- bit
- tinyint, smallint, int, bigint
- money, smallmoney
- float, real
- date/time types: datetime, smalldatetime, datetime2, date, time
- numeric and decimal types
- char(n), varchar(n), nchar(n), nvarchar(n)
- binary(n), varbinary(n)
- uniqueidentifier

13 Tables - Limitations
- No DML triggers
- No FOREIGN KEY or CHECK constraints
- No IDENTITY columns
- No UNIQUE indexes other than for the PRIMARY KEY
- A maximum of 8 indexes, including the index supporting the PRIMARY KEY
- NO SCHEMA CHANGES
  - For table changes you’ll have to DROP and RECREATE
  - Indexes are created at table creation time and can’t be changed

14 Rows
- TxID: global DB counter
  - Reset on restart
  - Incremented on TX start
- TxTimeStamp: global DB counter
  - Not reset on restart
  - Incremented on TX end
- Row Header
  - BeginTS: TX TS that created the row
  - EndTS: TX TS that deleted the row; Infinity while the row has not been deleted
  - StmtID: statement that created the row
  - IdxLinkCount: number of indexes that reference this row
- Payload = actual data
- Row Header layout: Begin TimeStamp (8 bytes) | End TimeStamp (8 bytes) | Statement ID (4 bytes) | Index Link Count (2 bytes) | 8 bytes * (number of indexes)
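
The row header sizes listed on this slide can be added up with a small Python sketch. It uses only the field widths given above; the function and constant names are hypothetical, and a real engine layout may add padding that the slide does not show.

```python
# Field widths exactly as listed on the slide (assumption: no extra padding).
BEGIN_TS_BYTES = 8
END_TS_BYTES = 8
STMT_ID_BYTES = 4
IDX_LINK_COUNT_BYTES = 2
INDEX_POINTER_BYTES = 8  # one pointer per index defined on the table


def row_header_bytes(index_count: int) -> int:
    """Header size in bytes for a row of a table with `index_count` indexes."""
    if index_count < 1:
        raise ValueError("a memory-optimized table needs at least one index")
    fixed = BEGIN_TS_BYTES + END_TS_BYTES + STMT_ID_BYTES + IDX_LINK_COUNT_BYTES
    return fixed + INDEX_POINTER_BYTES * index_count


if __name__ == "__main__":
    for n in (1, 2, 8):
        print(f"{n} index(es): {row_header_bytes(n)} header bytes")
```

With these widths the per-row overhead grows by 8 bytes for every additional index, which is one reason to keep the index count low on narrow tables.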

15 Indexes
- Hash Indexes
- Range Indexes
- Built on the MVCC method
- Table needs at least one index
  - They connect the rows together
- No pages or extents as we know them now
- Never written to disk, always only in memory
- Recreated on database recovery
  - Recovery is parallelized

16 Indexes - Hash Indexes
- 1D array of pointers
- Collection of hashed values of key columns
- Bucket count (array length) defined at table creation
  - Can’t be changed
  - Rounds up to the next power of 2
  - 1000 -> 1024 (8KB), 500000 -> 524288 (4MB)
  - Trade-off between average row chain size and available memory
- Only useful for equality checks
  - WHERE col1 = 'value' uses the index
  - WHERE col1 LIKE 'val%' won’t use the index
- Hash collisions
  - Multiple column values hash to the same integer value
  - Forward-only linked list
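
The bucket count rounding on this slide can be reproduced with a short Python sketch. It assumes each bucket is a single 8-byte pointer, which is what the slide's 1024 -> 8KB and 524288 -> 4MB figures imply; the function names are hypothetical.

```python
BUCKET_POINTER_BYTES = 8  # assumption: one 8-byte pointer per bucket


def effective_bucket_count(requested: int) -> int:
    """Round the requested bucket count up to the next power of 2."""
    if requested < 1:
        raise ValueError("bucket count must be positive")
    return 1 << (requested - 1).bit_length()


def bucket_array_bytes(requested: int) -> int:
    """Memory taken by the bucket array itself, excluding the rows."""
    return effective_bucket_count(requested) * BUCKET_POINTER_BYTES


if __name__ == "__main__":
    for requested in (1000, 500000):
        count = effective_bucket_count(requested)
        print(requested, "->", count, f"({bucket_array_bytes(requested) // 1024} KB)")
```

Because the array is allocated at table creation and can't be resized, it is worth running this kind of arithmetic before picking a bucket count.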

17 Indexes - Hash Indexes
- Hash index on Name; 5, 6, 7 are hashed values
- Insert first row to table: Hash(Matija) = 5
- Row 1: TimeStamps = 30, ∞; Index ptr = null; Name = Matija

18 Indexes - Hash Indexes
- Hash index on Name; 5, 6, 7 are hashed values
- Insert second row to table: Hash(Mladen) = 5
- Row 1: TimeStamps = 30, ∞; Index ptr = null; Name = Matija
- Row 2: TimeStamps = 150, ∞; Index ptr = 1; Name = Mladen

19 Indexes - Hash Indexes
- Hash index on Name; 5, 6, 7 are hashed values
- Update second row: change non-Name data for Mladen
- Row 2: TimeStamps = 150, 250; Index ptr = 1; Name = Mladen
- Row 2.1: TimeStamps = 250, ∞; Index ptr = 2; Name = Mladen
- Row 1: TimeStamps = 30, ∞; Index ptr = null; Name = Matija
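
A minimal Python sketch of how the timestamp pairs in this example could decide which row version a reader sees. It assumes the usual MVCC rule that a version is visible when BeginTS <= read timestamp < EndTS, and it reads the diagram as three versions in bucket 5 (30, ∞ for Matija; 150, 250 and 250, ∞ for Mladen). The class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List

INFINITY = math.inf  # EndTS of a version that has not been deleted


@dataclass(frozen=True)
class RowVersion:
    label: str
    begin_ts: float
    end_ts: float
    name: str


def is_visible(version: RowVersion, read_ts: float) -> bool:
    """Assumed rule: visible when BeginTS <= read_ts < EndTS."""
    return version.begin_ts <= read_ts < version.end_ts


def visible_versions(chain: List[RowVersion], read_ts: float) -> List[str]:
    """Walk the bucket's chain and keep the versions a reader at read_ts sees."""
    return [v.label for v in chain if is_visible(v, read_ts)]


# Chain for hash bucket 5 after the update: 2.1 -> 2 -> 1.
bucket_5 = [
    RowVersion("2.1", 250, INFINITY, "Mladen"),
    RowVersion("2", 150, 250, "Mladen"),
    RowVersion("1", 30, INFINITY, "Matija"),
]

if __name__ == "__main__":
    for ts in (100, 200, 300):
        print(ts, visible_versions(bucket_5, ts))
```

A reader at timestamp 200 still sees the old Mladen version, while a reader at 300 sees the updated one; neither reader blocks the writer.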

20 Indexes - Range Indexes
- For when you have no clue about the number of buckets
- Searches based on a data range
- Bw-tree

21 Indexes - Range Indexes: Bw-tree
- Developed by Microsoft Research in 2011
- Lock- and latch-free version of a B-tree
- Index pages are not fixed size (max size is still 8 KB)
  - Can’t be changed
- Page pointer (PID)
  - Location in the Page Mapping Table (1D array of page locations)
  - For a disk-based table it’s a physical location

22 Indexes - Range Indexes: Bw-tree
- Delta pages
  - Because existing pages can’t be changed
- Three page reorganization options
  - Consolidation of delta records
    - Delta chain > 16: page rebuilt
    - Old pages (original + delta) marked for garbage collection
  - Splitting of a full index page
    - When the page has no more space (original + delta) to add new rows
  - Merging of adjacent index pages
    - Page has 1 row or is less than 10% of max size
    - Merge with a neighbor page if it has enough space (new page created)
- In the end always the same PID but a different memory address

23 Indexes - Range Indexes: Pages
- Page Mapping Table: 0 1 2 3 4 5 6 7 8 9 10 …
- Index root: PID 0 (keys 8, 20, 31)
- Non-leaf pages: PID 6 (keys 3, 5, 8), PID 2 (keys 12, 15, 20), PID 8 (keys 24, 28, 31)
- Leaf pages: PID 10 (keys 1, 2, 3), PID 7 (keys 6, 7, 8), PID 4 (keys 29, 30, 31)
- Data rows: 30, ∞ | null | 1 and 20, 45 | null | 3
- Legend: Key, Memory address

24 Indexes - Range Indexes: Delta Rows
- Page Mapping Table: 0 1 2 3 …
- Page with existing rows, PID = 2
- PID 2 -> Original rows

25 Indexes - Range Indexes: Delta Rows
- Page Mapping Table: 0 1 2.1 3 …
- Insert one row to the page
- PID 2.1 -> Delta record 1 (inserted row) -> Original rows

26 Indexes - Range Indexes: Delta Rows
- Page Mapping Table: 0 1 2.2 3 …
- Delete a row from the page
- PID 2.2 -> Delta record 2 (deleted row) -> Delta record 1 (inserted row) -> Original rows
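
The delta record idea from the last few slides can be sketched in Python: the page mapping table keeps one slot per PID, every change prepends an immutable delta record, and a chain longer than 16 is consolidated into a new base page under the same PID. This is a single-threaded illustration with hypothetical names; it assumes unique keys and leaves out the atomic compare-and-swap, page splits and merges that a real Bw-tree needs.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

CONSOLIDATION_THRESHOLD = 16  # from the slides: delta chain > 16 -> page rebuilt


@dataclass(frozen=True)
class BasePage:
    keys: Tuple[int, ...]  # immutable once built


@dataclass(frozen=True)
class DeltaRecord:
    op: str  # "insert" or "delete"
    key: int
    next: Union["DeltaRecord", BasePage]  # older delta or the base page


class PageMappingTable:
    def __init__(self) -> None:
        self.slots: List[Union[DeltaRecord, BasePage]] = []  # index = PID

    def allocate(self, keys) -> int:
        """Create a base page and return its PID."""
        self.slots.append(BasePage(tuple(sorted(keys))))
        return len(self.slots) - 1

    def chain_length(self, pid: int) -> int:
        node, length = self.slots[pid], 0
        while isinstance(node, DeltaRecord):
            length += 1
            node = node.next
        return length

    def read(self, pid: int) -> List[int]:
        """Apply the delta chain (oldest first) on top of the base page."""
        deltas = []
        node = self.slots[pid]
        while isinstance(node, DeltaRecord):
            deltas.append(node)
            node = node.next
        keys = set(node.keys)
        for delta in reversed(deltas):
            if delta.op == "insert":
                keys.add(delta.key)
            else:
                keys.discard(delta.key)
        return sorted(keys)

    def apply(self, pid: int, op: str, key: int) -> None:
        """Prepend a delta; the existing page is never modified in place."""
        if op not in ("insert", "delete"):
            raise ValueError(f"unknown operation: {op}")
        self.slots[pid] = DeltaRecord(op, key, self.slots[pid])
        if self.chain_length(pid) > CONSOLIDATION_THRESHOLD:
            # Same PID, new page object (a different memory address).
            self.slots[pid] = BasePage(tuple(self.read(pid)))
```

Readers that hold a reference to the old chain keep a consistent view, which is why the old pages are only marked for garbage collection instead of being freed right away.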

27 Transaction log
- Still needed to guarantee full ACID
- Data written to it only at TX commit
- In-Memory is optimized for multiple concurrent log streams
- Optimized writes, much less data written
- Can still be a bottleneck
  - Use fast drives!
- Fully skipped for non-durable tables
  - Because it gives us ->

28 Transaction log
- DEMO

29 Checkpoints
- Needed to reduce recovery time
  - Same as for disk-based tables
- Multiple Data and Delta files
  - 1-1 mapping
- Checkpoint inventory file

30 Checkpoints - Data Files
- Contains only inserted versions of rows
  - INSERTs + UPDATEs (MVCC)
- Append only
- Each file covers a specific timestamp range
- Max 128MB per file

31 Checkpoints - Delta files
- Info about deleted rows in the Data file
- Append only
- Mapping to a single data file

32 Checkpoints - Checkpoint Issued
- T-LOG scanned from the previous checkpoint and converted to Data/Delta files
- Checkpoint Inventory File created
  - Locations of all Data/Delta files
  - Saved to the transaction log
- Location of the Checkpoint Inventory File saved to the transaction log
- Multiple Data/Delta files allow for highly parallel database recovery
- In-Memory recovery happens in parallel with normal disk-based recovery

33 Checkpoints – Merging Multiple Data/Delta Files
- Automatic
- Manual
  - SP: sys.sp_xtp_merge_checkpoint_files
- DMV: sys.dm_db_xtp_checkpoint_files
- DMV: sys.dm_db_xtp_merge_requests
- Only adjacent less than 50% full files can be merged
Data files | Merge selection | After merge
D1 (30%), D2 (30%), D3 (30%), D4 (40%) | D1, D2, D3 | D123, D4
D1 (30%), D2 (50%), D3 (50%), D4 (40%) | D1, D2 and D3, D4 | D12, D34
D1 (50%), D2 (70%), D3 (60%), D4 (50%) | / | D1, D2, D3, D4
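
The three merge examples on this slide are all consistent with a simple greedy rule: walk the data files left to right and keep adding the next adjacent file to the current group while the combined fill stays at or below 100%. The Python sketch below implements that inferred rule. It is an assumption drawn from the slide's examples, not a description of the engine's actual merge policy, and the function name is hypothetical.

```python
from typing import List


def select_merges(fill_percentages: List[int]) -> List[str]:
    """Group adjacent data files whose combined fill stays <= 100%.

    Files are numbered from 1; a group of files 1, 2, 3 is returned as "D123".
    """
    if not fill_percentages:
        return []
    groups: List[List[int]] = []
    current = [1]
    total = fill_percentages[0]
    for number, fill in enumerate(fill_percentages[1:], start=2):
        if total + fill <= 100:
            current.append(number)  # still fits into one merged file
            total += fill
        else:
            groups.append(current)  # close the group, start a new one
            current = [number]
            total = fill
    groups.append(current)
    return ["D" + "".join(str(n) for n in group) for group in groups]


if __name__ == "__main__":
    for fills in ([30, 30, 30, 40], [30, 50, 50, 40], [50, 70, 60, 50]):
        print(fills, "->", select_merges(fills))
```

Running it on the three rows of the slide's table gives the same "After merge" column as the slide.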

34 Relational Engine
- T-SQL
- Stored procedures
- Isolation levels

35 T-SQL
- 2 ways to access In-Memory tables
  - T-SQL interop
  - Natively compiled stored procedures
- T-SQL interop is usually slower than compiled stored procedures
  - But it depends on the operation
  - Complex business logic is faster in compiled procedures

36 T-SQL: Not allowed on In-Memory tables
- TRUNCATE TABLE
- MERGE (when a memory-optimized table is the target)
- Cross-database queries and transactions
- Linked servers
- Locking hints: TABLOCK, XLOCK, PAGLOCK, etc.
  - NOLOCK is supported, but is quietly ignored
- Isolation level hints: READUNCOMMITTED, READCOMMITTED, READCOMMITTEDLOCK
- Dynamic and keyset cursors are degraded to static cursors

37 Stored procedures
- Only allowed access to In-Memory OLTP tables
- Compiled to C DLL
- Atomic block
  - All statements in a single TX
  - TX savepoint if in an existing TX
- No parameter sniffing
  - UNKNOWN by default
- Execution plan embedded in the DLL
create procedure dbo.spCompiledProc
    @param1 int not null,
    @param2 varchar(100)
with native_compilation, schemabinding, execute as owner
as
begin atomic with (transaction isolation level = snapshot, language = N'us_english')
    -- stored procedure body
end

38 Stored procedures: T-SQL support
- TRY, CATCH, error functions, IF, WHILE
- Memory-optimized table types and table variables
- Math, Date, String functions
- SCOPE_IDENTITY, ISNULL, NEWID, NEWSEQUENTIALID …
- No DISTINCT, WITH TIES, PERCENT
- TOP with ORDER BY returns max 8192 rows for a single table
  - 4096 if two joined tables, 2730 if three joined tables, ...
- BOL: “Supported Constructs in Natively Compiled Stored Procedures”
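
The three TOP limits quoted on this slide (8192, 4096, 2730) match an integer division of 8192 by the number of tables in the query. The Python sketch below encodes that observation; the function name is hypothetical, and whether the same pattern holds beyond three tables is an assumption that the slide does not confirm.

```python
SINGLE_TABLE_TOP_LIMIT = 8192  # from the slide: max rows for TOP with ORDER BY


def max_top_rows(joined_tables: int) -> int:
    """Row limit for TOP with ORDER BY, assuming limit = 8192 // table count."""
    if joined_tables < 1:
        raise ValueError("a query needs at least one table")
    return SINGLE_TABLE_TOP_LIMIT // joined_tables


if __name__ == "__main__":
    for tables in (1, 2, 3):
        print(f"{tables} table(s): at most {max_top_rows(tables)} rows")
```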

39 Stored procedures: Compilation
- Processing flow: T-SQL stored procedure -> Parser -> Algebrizer -> Query optimizer -> Compiler -> Runtime
- Diagram labels: processing flow with query trees; processing flow with optimized query plans; DLL created; loaded to memory

40 Stored procedures: Execution
- EXEC spDoStuff
- Parser extracts the name and parameters
- Runtime locates the sp DLL entry point
- DLL executes the sp logic
- Results returned to the client

41 Data operations and Isolation levels
- SNAPSHOT
  - Read data must be the same as at the start of the transaction
- REPEATABLE READ
  - Can’t read modified non-committed data from other transactions
- SERIALIZABLE
  - SNAP+RR+No phantom reads (new row inserts in the queried key range)
- READ COMMITTED
  - Only for autocommit (single statement) transactions
- Any violation of the rules above makes the current transaction fail

42 Maintenance of DLLs
- DLLs saved on the filesystem
- No need for DBAs to do anything
  - Automatic removal when the object or DB is dropped or the server is restarted
- SELECT name, description FROM sys.dm_os_loaded_modules WHERE description = 'XTP Native DLL'

43 Performance comparisons: Transaction/Sec
- Normal stored procedure and table
- Normal stored procedure and Hekaton table
- Hekaton stored procedure and Hekaton table
- Scales linearly with the number of CPUs

44 Performance comparisons: Execution time

45 ?

