Progress Report
Chia-Lun Wu
2015.6.9
Simulate client/server environment
  2 VMs (server and client)
  Python script simulates the clients
    n users -> n threads
    Each thread calls the bosc_test binary
  N users, each issuing M requests with K records
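The client simulation described above could be sketched roughly as follows. This is a minimal illustration and not the actual script: the command line of bosc_test is not given in these slides, so the command is passed in by the caller, and the names run_user, simulate, think_time, and DEMO_CMD are hypothetical.

```python
import subprocess
import sys
import threading
import time

# Stand-in command so the sketch runs anywhere; in the real setup this would
# be the bosc_test invocation (its arguments are not shown in the slides).
DEMO_CMD = [sys.executable, "-c", "pass"]


def run_user(user_id, cmd, num_requests, results, think_time=0.0):
    """One simulated user: run cmd once per request and record each latency."""
    latencies = []
    for _ in range(num_requests):
        start = time.monotonic()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        latencies.append(time.monotonic() - start)
        if think_time:
            # Optional pause between requests (hypothetical parameter).
            time.sleep(think_time)
    results[user_id] = latencies


def simulate(cmd, num_users, num_requests, think_time=0.0):
    """Start one thread per user and return {user_id: [latency, ...]}."""
    results = {}
    threads = [
        threading.Thread(
            target=run_user,
            args=(user_id, cmd, num_requests, results, think_time),
        )
        for user_id in range(num_users)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling simulate(cmd, num_users=N, num_requests=M) with cmd set to the bosc_test invocation would give the N users x M requests pattern; how K records per request is passed to the binary is left open here.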
Experiment Settings
  Parameters
    BOSC_SCAN_THRESHOLD (10000, 100000)
    IO_thread max sleep time (60s, 30s)
  50 users, each issuing 1000 requests with 10 records
  Monitor throughput over time
  Random, update-intensive workload
    50% insert, 50% modify
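A workload with this mix could be generated along the following lines. This is only a sketch under assumptions: the record layout (op, key, value), the key space size, and the function names are hypothetical, since the slides do not describe the record format used by bosc_test.

```python
import random


def make_request(rng, num_records, key_space):
    """Build one request as a list of (op, key, value) records."""
    records = []
    for _ in range(num_records):
        # 50% insert, 50% modify, chosen independently per record.
        op = "insert" if rng.random() < 0.5 else "modify"
        # Keys are drawn uniformly at random, which gives random access.
        key = rng.randrange(key_space)
        value = rng.getrandbits(32)
        records.append((op, key, value))
    return records


def make_workload(num_requests, num_records, key_space=1_000_000, seed=0):
    """Build the request list for one user; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [make_request(rng, num_records, key_space) for _ in range(num_requests)]
```

For the settings above, each of the 50 users would call make_workload(1000, 10) with its own seed.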
Experiment Results
  Throughput degrades while flushing
  Flushing to disk can't keep up with memory updates
    The number of dirty blocks is higher after a flush than before it (e.g., 1981 -> flush -> 86794)
    With each user sleeping 5s between requests, the growth in dirty blocks is only delayed
    The system ends up flushing all the time
  Is a throttling mechanism needed to limit the rate of memory updates?
  Flushing to disk is slow
  Real-time ingestion of high-frequency, small-sized sensor data streams
    Streaming random disk writes: > 100000 records per second per server
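One common way to limit an update rate is a token bucket. The sketch below is an illustration of that general technique only; it is not part of BOSC, and the class name, the parameters, and the injectable clock are assumptions made for the example.

```python
import time


class TokenBucket:
    """Admit at most `rate` updates per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum number of stored tokens
        self.clock = clock        # injectable so the bucket can be tested
        self.tokens = capacity
        self.last = clock()

    def try_acquire(self, n=1):
        """Return True and consume n tokens if available, else return False."""
        now = self.clock()
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

An update path could call try_acquire(number_of_records) before applying a request in memory and delay or reject the request when it returns False. A rate close to the measured flush throughput would be a natural starting point, though that value would have to come from the experiments.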
Experiment Results: SCAN_THRESHOLD = 10000, SLEEP = 60s, no flush (throughput)
Experiment Results: SCAN_THRESHOLD = 10000, SLEEP = 60s, flush
Experiment Results: SCAN_THRESHOLD = 100000, SLEEP = 60s, no flush
Experiment Results: SCAN_THRESHOLD = 100000, SLEEP = 30s, no flush
Experiment Results: SCAN_THRESHOLD = 10000, SLEEP = 60s, 100 clients, each sleeping 5s after every request
  Dirty blocks still increase after flushing (plot annotated with five flush events)
Experiment Results: average throughput (50 users, each issuing 1000 requests with 10 records)
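Throughput over time and average throughput could be derived from request completion timestamps as sketched below. The function names and the choice of fixed-width time buckets are assumptions; the slides do not say how the plotted values were computed.

```python
from collections import Counter


def throughput_per_interval(completion_times, interval=1.0):
    """Map each bucket start time to requests per second in that bucket.

    completion_times are seconds since the start of the run.
    """
    counts = Counter(int(t // interval) for t in completion_times)
    return {bucket * interval: n / interval for bucket, n in sorted(counts.items())}


def average_throughput(num_requests, elapsed_seconds):
    """Average requests per second over the whole run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_requests / elapsed_seconds
```

For the run above, num_requests would be 50 * 1000 = 50000, divided by the measured wall-clock time of the run.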