Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads
Iraklis Psaroudakis (EPFL), Tobias Scheuer (SAP AG), Norman May (SAP AG), Anastasia Ailamaki (EPFL)
Scheduling for high concurrency
Queries >> H/W contexts
Limited number of H/W contexts
How should the DBMS use available CPU resources?
Scheduling for mixed workloads
OLTP
– Short-lived
– Reads & updates
– Scan-light
– Single thread
OLAP
– Long-running
– Read-only
– Scan-heavy
– Parallelism
Contention in highly concurrent situations
How to schedule highly concurrent mixed workloads?
Scheduling tactics
OS scheduler
– Context switches
– Cache thrashing
– Overutilization
Admission control
– Coarse granularity of control
– [Figure: # threads over time relative to # H/W contexts, alternating between underutilization and overutilization]
We need to avoid both underutilization and overutilization
Task scheduling
A task can contain any code: run() {... }
One worker thread per core processing tasks
[Figure: task queues and worker threads on Socket 1 and Socket 2]
Distributed queues to minimize sync contention
Task stealing to fix imbalance
Provides a solution to efficiently utilize CPU resources
OLAP queries can parallelize w/o overutilization
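The queueing idea on this slide can be illustrated with a small, single-threaded toy model. This is a sketch under stated assumptions, not SAP HANA's scheduler: the class `StealingScheduler`, the rule of stealing from the longest queue, and the round-robin driver are all hypothetical choices made for illustration.

```python
from collections import deque


class StealingScheduler:
    """Toy model: one queue per worker; an idle worker steals from the fullest queue."""

    def __init__(self, num_workers):
        self.queues = [deque() for _ in range(num_workers)]

    def submit(self, worker_id, task):
        # A task is any callable, mirroring "a task can contain any code".
        self.queues[worker_id].append(task)

    def next_task(self, worker_id):
        own = self.queues[worker_id]
        if own:
            return own.popleft()      # local work first, no cross-queue access
        victim = max(self.queues, key=len)
        if victim:
            return victim.pop()       # steal from the opposite end of the victim
        return None

    def run_round_robin(self):
        """Each worker takes one task per round until every queue is empty."""
        executed = [[] for _ in self.queues]
        while any(self.queues):
            for worker_id in range(len(self.queues)):
                task = self.next_task(worker_id)
                if task is not None:
                    executed[worker_id].append(task())
        return executed


sched = StealingScheduler(num_workers=2)
for i in range(4):
    sched.submit(0, lambda i=i: i * i)   # all work lands on worker 0
result = sched.run_round_robin()
print(result)
```

Although every task is submitted to worker 0, worker 1 ends up executing half of them by stealing, which is the imbalance fix the slide refers to.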
Task scheduling problems for DBMS
OLTP tasks can block
– Problem: under-utilization of CPU resources
– Solution: flexible concurrency level
OLAP queries can issue an excessive number of tasks in highly concurrent situations
– Problem: unnecessary scheduling overhead
– Solution: concurrency hint
Outline
Introduction
Flexible concurrency level
Concurrency hint
Experimental evaluation with SAP HANA
Conclusions
Fixed concurrency level
Typical task scheduling (fixed concurrency level):
– Bypasses the OS scheduler
– OLTP tasks may block
– Underutilization
A fixed concurrency level is not suitable for DBMS
Flexible concurrency level
Issue additional workers when tasks block
Cooperate with the OS scheduler: the OS schedules the threads
Concurrency level = # of worker threads
Active concurrency level = # of active worker threads
Target: active concurrency level = # H/W contexts
Worker states
[Figure: the task scheduler contains active workers, inactive workers, parked workers, the watchdog, and other threads]
Inactive workers:
– Blocked in syscall
– Inactive by user
– Waiting for a task
Watchdog:
– Monitors, gathers statistics, and takes actions
– Keeps active concurrency level ≈ # of H/W contexts
We dynamically re-adjust the scheduler's concurrency level
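The watchdog's role can be sketched as a periodic pass over worker counts. The slide only states that the watchdog keeps the active concurrency level close to the number of H/W contexts. The policy below (wake parked workers before creating new ones, and park any surplus) is an assumption made for illustration, and `WorkerCounts` and `watchdog_tick` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class WorkerCounts:
    hw_contexts: int
    active: int        # workers currently able to run tasks
    blocked: int = 0   # workers blocked in a syscall
    parked: int = 0    # spare workers kept for reuse


def watchdog_tick(w):
    """One watchdog pass; returns how many new worker threads were created."""
    created = 0
    if w.active < w.hw_contexts:
        deficit = w.hw_contexts - w.active
        woken = min(deficit, w.parked)   # assumed policy: reuse parked workers first
        created = deficit - woken
        w.parked -= woken
        w.active += deficit
    elif w.active > w.hw_contexts:
        surplus = w.active - w.hw_contexts
        w.active -= surplus              # assumed policy: park the surplus
        w.parked += surplus
    return created


w = WorkerCounts(hw_contexts=4, active=4)
w.active, w.blocked = 2, 2               # two OLTP tasks block in a syscall
created_first = watchdog_tick(w)         # no parked workers yet, so two are created
w.active, w.blocked = w.active + 2, 0    # the blocked workers resume
watchdog_tick(w)                         # the surplus is parked
w.active, w.blocked = 3, 1               # one task blocks again
created_second = watchdog_tick(w)        # a parked worker is woken instead
```

In this model the total number of worker threads (the concurrency level) grows when tasks block, while the active concurrency level returns to the number of H/W contexts after each pass.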
Partitionable operations
Can be split into a variable number of tasks
[Figure: an aggregation (Σ) split into Partition 1, Partition 2, and Partition 3, then combined into the final result]
Each operation calculates its task granularity: 1 ≤ # tasks ≤ # H/W contexts
Problem: calculation independent of the system's concurrency situation
– High concurrency: excessive number of tasks
– Unnecessary scheduling overhead
We should restrict task granularity under high concurrency
Restricting task granularity
Existing frameworks for data parallelism
– Not straightforward for a commercial DBMS
– Simpler way?
free worker threads = max(0, # of H/W contexts - # active worker threads)
concurrency hint = exponential moving average of free worker threads
The concurrency hint serves as an upper bound for # tasks
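The two formulas on this slide translate directly into a few lines of code. The sketch below is illustrative only: the slides do not give the smoothing factor or the sampling interval, so `alpha=0.5`, the initial value, and the names `ConcurrencyHint`, `observe`, and `max_tasks` are assumptions.

```python
class ConcurrencyHint:
    """Exponential moving average of free worker threads (illustrative sketch)."""

    def __init__(self, hw_contexts, alpha=0.5):
        self.hw_contexts = hw_contexts
        self.alpha = alpha                 # assumed smoothing factor
        self.value = float(hw_contexts)    # assumed start: an idle system

    def observe(self, active_workers):
        # free worker threads = max(0, # H/W contexts - # active worker threads)
        free = max(0, self.hw_contexts - active_workers)
        self.value = self.alpha * free + (1 - self.alpha) * self.value
        return self.value

    def max_tasks(self):
        # Upper bound on tasks for a partitionable operation, never below 1.
        return max(1, int(self.value))


hint = ConcurrencyHint(hw_contexts=8)
samples = [hint.observe(active) for active in (8, 8, 10)]   # saturated system
busy_bound = hint.max_tasks()
hint.observe(0)                                             # load disappears
recovering_bound = hint.max_tasks()
```

Because the hint is a moving average, a single idle sample does not immediately restore full parallelism; the bound recovers gradually as free workers keep being observed.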
Concurrency hint
Low concurrency situations: concurrency hint approaches # H/W contexts
– Low latency
High concurrency situations: concurrency hint approaches 1
– High latency
– Low scheduling overhead
– Higher throughput
Lightweight way to restrict task granularity under high concurrency
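How a partitionable operation might apply the hint can be sketched as follows. This is a minimal illustration, not the method used in SAP HANA: the function names, the `min_rows_per_task` parameter, and the rule for the operation's own granularity are hypothetical.

```python
def plan_partitions(num_rows, hw_contexts, concurrency_hint, min_rows_per_task=1000):
    """Return (start, end) row ranges, one per task."""
    # The operation's own granularity: 1 <= # tasks <= # H/W contexts.
    own = max(1, min(hw_contexts, num_rows // min_rows_per_task))
    # The concurrency hint is an upper bound on the number of tasks.
    num_tasks = max(1, min(own, concurrency_hint))
    base, extra = divmod(num_rows, num_tasks)
    bounds, start = [], 0
    for i in range(num_tasks):
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def partitioned_sum(values, bounds):
    # Each partial sum stands for one task; the final sum combines them.
    partials = [sum(values[a:b]) for a, b in bounds]
    return sum(partials)


low_load = plan_partitions(10_000, hw_contexts=8, concurrency_hint=8)
high_load = plan_partitions(10_000, hw_contexts=8, concurrency_hint=1)
small = plan_partitions(10, hw_contexts=8, concurrency_hint=3, min_rows_per_task=1)
total = partitioned_sum(list(range(10)), small)
```

With a hint of 8 the aggregation is split into eight tasks, favoring latency. With a hint of 1 it runs as a single task, avoiding scheduling overhead when the system is already saturated.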
Experimental evaluation with SAP HANA
Workloads:
– TPC-H SF=10
– TPC-H SF=10 + TPC-C WH=200
Configuration:
– 8x10 Intel Xeon E GHz, with hyperthreading, 1 TB RAM, 64-bit SMP Linux (SuSE) kernel
– Several iterations. No caching. No think times.
We compare:
– Fixed (fixed concurrency level)
– Flexible (flexible concurrency level)
– Hint (flexible concurrency level + concurrency hint)
TPC-H – Response time
Task granularity can affect OLAP performance by 11%
TPC-H - Measurements
Unnecessary overhead from too many tasks under high concurrency
TPC-H - Timelines
[Figure: timelines for Fixed and Hint]
TPC-H and TPC-C
Throughput experiment
– Variable TPC-H clients = TPC-C clients = 200.
Conclusions
Task scheduling for
– Resource management
For DBMS
– Handle tasks that block. Solution: flexible concurrency level
– Correlate task granularity of analytical queries with concurrency to avoid unnecessary scheduling overhead. Solution: concurrency hint
Thank you! Questions?
TPC-H and TPC-C - Measurements
SAP HANA's architecture
[Figure: architecture components]
Network
Connection and Session Management, Receivers
Various access interfaces (SQL, SQL Script, etc.)
Optimizer and Plan Generator
Calculation Engine
Execution Engine, Executor, Dispatcher, Scheduler
Row-Store, Column-Store, Graph Engine, Text Engine
Persistence Layer (Logging, Recovery, Page Management)
Metadata Manager, Authorization, Transaction Manager
Our task scheduler
[Figure: task graph (root node, non-root nodes), uninitiated graphs, task queues, worker threads]
Distributed queues to minimize sync contention
Task stealing for load-balancing