Computer Systems Principles
Concurrency Patterns
  Emery Berger and Mark Corner
  University of Massachusetts Amherst, Department of Computer Science
Web Server
  [Figure: client and web server]
  Client (browser)
    – Requests HTML, images
  Server
    – Caches requests
    – Sends to client
Possible Implementation
  while (true) {
    wait for connection;
    read from socket & parse URL;
    look up URL contents in cache;
    if (!in cache) {
      fetch from disk / execute CGI;
      put in cache;
    }
    send data to client;
  }
Possible Implementation
  while (true) {
    wait for connection;               // net
    read from socket & parse URL;      // cpu
    look up URL contents in cache;     // cpu
    if (!in cache) {
      fetch from disk / execute CGI;   // disk
      put in cache;                    // cpu
    }
    send data to client;               // net
  }
Problem: Concurrency
  [Figure: clients and web server]
  Sequential is fine until:
    – More clients
    – Bigger server (multicores, multiprocessors)
  Goals:
    – Hide latency of I/O: don't keep clients waiting
    – Improve throughput: serve up more pages
Building Concurrent Apps
  Patterns / Architectures
    – Thread pools
    – Producer-consumer
    – "Bag of tasks"
    – Worker threads (work stealing)
  Goals:
    – Minimize latency
    – Maximize parallelism
    – Keep programs simple to write & maintain
Thread Pools
  Thread creation is relatively expensive
  Instead: use a pool of threads
    – When a new task arrives, get a thread from the pool to work on it; block if the pool is empty
    – Faster with many tasks
    – Limits max threads (and thus resources)
    – (ThreadPoolExecutor class in Java)
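The thread-pool idea can be sketched in Java as follows. This is a minimal illustration, not the course's implementation: `ThreadPoolSketch`, `handle`, and `runAll` are hypothetical names, and `handle` merely stands in for serving one request. One difference from the slide: `Executors.newFixedThreadPool` (which builds a `ThreadPoolExecutor`) does not block the submitter when all threads are busy; it holds waiting tasks in an internal queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadPoolSketch {
    // Hypothetical stand-in for the work done to serve one request.
    static int handle(int requestId) {
        return requestId * 2;
    }

    // Runs `tasks` requests on a pool of `poolSize` threads and sums the results.
    static int runAll(int tasks, int poolSize) {
        // Fixed-size pool: threads are created once and reused for every task.
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                results.add(pool.submit(() -> handle(id)));
            }
            int sum = 0;
            for (Future<Integer> f : results) {
                sum += f.get(); // waits until that task has finished
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // let the pool threads exit
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(100, 4));
    }
}
```

The pool size caps the number of threads regardless of how many tasks are submitted, which is the resource limit the slide refers to.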
Producer-Consumer
  [Figure: producer and consumer]
  Can get pipeline parallelism:
    – One thread (producer) does work, e.g., I/O
    – and hands it off to another thread (consumer)
Producer-Consumer: LinkedBlockingQueue
  [Figure: producer, LinkedBlockingQueue, consumer]
  Producer hands work to consumer through a LinkedBlockingQueue
    – put() blocks if the queue is full (when a capacity is set)
    – take() blocks if the queue is empty; poll() with no timeout returns null immediately instead of blocking
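A minimal Java sketch of one producer and one consumer sharing a bounded `LinkedBlockingQueue` follows. The class name, the `run` helper, and the `DONE` sentinel used for shutdown are assumptions made for this illustration; the slides do not specify how the threads stop.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ProducerConsumerSketch {
    private static final int DONE = -1; // hypothetical "no more work" marker

    // Producer emits 1..items; consumer sums them. Returns the consumer's total.
    static long run(int items, int capacity) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>(capacity);
        long[] total = {0}; // written only by the consumer, read after join()

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= items; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int x = queue.take(); // blocks while the queue is empty
                    if (x == DONE) break;
                    total[0] += x;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 8));
    }
}
```

A small capacity makes the producer wait whenever it gets too far ahead, which keeps the two threads proceeding at roughly the same rate.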
Producer-Consumer Web Server
  Use 2 threads: producer & consumer
    – queue.put(x) and x = queue.take()
  Producer:
    while (true) {
      do something…
      queue.put (x);
    }
  Consumer:
    while (true) {
      x = queue.take();
      do something…
    }
  Sequential server to split up:
    while (true) {
      wait for connection;
      read from socket & parse URL;
      look up URL contents in cache;
      if (!in cache) {
        fetch from disk / execute CGI;
        put in cache;
      }
      send data to client;
    }
Producer-Consumer Web Server
  Pair of threads: one reads, one writes
  Producer (reads requests):
    while (true) {
      wait for connection;
      read from socket & parse URL;
      queue.put (URL);
    }
  Consumer (writes responses):
    while (true) {
      URL = queue.take();
      look up URL contents in cache;
      if (!in cache) {
        fetch from disk / execute CGI;
        put in cache;
      }
      send data to client;
    }
Producer-Consumer Web Server
  More parallelism: optimizes the common case (cache hit)
  Thread 1 (feeds queue 1):
    while (true) {
      wait for connection;
      read from socket & parse URL;
      queue1.put (URL);
    }
  Thread 2 (queue 1 to queue 2):
    while (true) {
      URL = queue1.take();
      look up URL contents in cache;
      if (!in cache) {
        queue2.put (URL);
        continue;          // miss: thread 3 will answer this request
      }
      send data to client;
    }
  Thread 3 (drains queue 2):
    while (true) {
      URL = queue2.take();
      fetch from disk / execute CGI;
      put in cache;
      send data to client;
    }
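The three-thread pipeline can be sketched in Java roughly as below. Everything here is a simplification for illustration: requests are plain strings, "sending" is a counter increment, the disk fetch is a string concatenation, and the `STOP` marker used to shut the stages down is an assumption not present in the slides.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class PipelineSketch {
    private static final String STOP = "\u0000STOP"; // hypothetical shutdown marker

    // Serves every URL in `urls`; returns how many responses were "sent".
    static int serve(List<String> urls) {
        BlockingQueue<String> queue1 = new LinkedBlockingQueue<>();
        BlockingQueue<String> queue2 = new LinkedBlockingQueue<>();
        Map<String, String> cache = new ConcurrentHashMap<>();
        AtomicInteger sent = new AtomicInteger();

        Thread reader = new Thread(() -> {          // thread 1: accept and parse
            try {
                for (String url : urls) queue1.put(url);
                queue1.put(STOP);
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread lookup = new Thread(() -> {          // thread 2: cache hits
            try {
                while (true) {
                    String url = queue1.take();
                    if (url.equals(STOP)) { queue2.put(STOP); break; }
                    if (!cache.containsKey(url)) { queue2.put(url); continue; }
                    sent.incrementAndGet();         // hit: "send data to client"
                }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread fetcher = new Thread(() -> {         // thread 3: cache misses
            try {
                while (true) {
                    String url = queue2.take();
                    if (url.equals(STOP)) break;
                    cache.put(url, "contents of " + url); // "fetch from disk"
                    sent.incrementAndGet();
                }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        Thread[] stages = {reader, lookup, fetcher};
        for (Thread t : stages) t.start();
        try {
            for (Thread t : stages) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sent.get();
    }

    public static void main(String[] args) {
        System.out.println(serve(Arrays.asList("/a", "/b", "/a", "/c", "/a")));
    }
}
```

Whether a repeated URL counts as a hit depends on timing: the lookup thread may see the second request for `/a` before the fetcher has cached the first. Either way each request is answered exactly once.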
When to Use Producer-Consumer
  Works well for pairs of threads
    – Best if producer & consumer are symmetric: proceed at roughly the same rate
    – Order of operations matters
  Not as good for
    – Many threads
    – Order doesn't matter
    – Different rates of progress
Producer-Consumer Web Server
  Should balance load across threads: each stage is pinned to one thread, so the slowest stage limits throughput
  Thread 1 (feeds queue 1):
    while (true) {
      wait for connection;
      read from socket & parse URL;
      queue1.put (URL);
    }
  Thread 2 (queue 1 to queue 2):
    while (true) {
      URL = queue1.take();
      look up URL contents in cache;
      if (!in cache) {
        queue2.put (URL);
        continue;          // miss: thread 3 will answer this request
      }
      send data to client;
    }
  Thread 3 (drains queue 2):
    while (true) {
      URL = queue2.take();
      fetch from disk / execute CGI;
      put in cache;
      send data to client;
    }
Bag of Tasks
  [Figure: workers drawing tasks from a shared bag]
  Collection of mostly independent tasks
Bag of Tasks
  [Figure: addWork thread filling the bag, workers drawing from it]
  Collection of mostly independent tasks
  Bag could also be a LinkedBlockingQueue (put, take)
Exercise: Restructure into BOT
  Re-structure this into bag of tasks:
    – addWork & worker threads
    – t = bag.take() or bag.put(t)
  Sequential server:
    while (true) {
      wait for connection;
      read from socket & parse URL;
      look up URL contents in cache;
      if (!in cache) {
        fetch from disk / execute CGI;
        put in cache;
      }
      send data to client;
    }
Exercise: Restructure into BOT
  Re-structure this into bag of tasks:
    – addWork & worker
    – t = bag.take() or bag.put(t)
  addWork:
    while (true) {
      wait for connection;
      read from socket & parse URL;
      t.URL = URL;
      t.sock = socket;
      bag.put (t);
    }
  Worker:
    while (true) {
      t = bag.take();
      look up t.URL contents in cache;
      if (!in cache) {
        fetch from disk / execute CGI;
        put in cache;
      }
      send data to client via t.sock;
    }
Bag of Tasks Web Server
  [Figure: one addWork thread, several workers]
  Re-structure this into bag of tasks:
    – t = bag.take() or bag.put(t)
  addWork:
    while (true) {
      wait for connection;
      read from socket & parse URL;
      bag.put (URL);
    }
  worker:
    while (true) {
      URL = bag.take();
      look up URL contents in cache;
      if (!in cache) {
        fetch from disk / execute CGI;
        put in cache;
      }
      send data to client;
    }
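A bag-of-tasks server with several workers might look like the Java sketch below. It assumes the bag is a `LinkedBlockingQueue`, represents the socket as an integer, reduces "serving" to a counter, and uses one hypothetical `STOP` task per worker for shutdown; none of these details come from the slides.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class BagOfTasksSketch {
    // One unit of work: which URL to serve and where to send it.
    static final class Task {
        final String url;
        final int sock; // stands in for a real socket
        Task(String url, int sock) { this.url = url; this.sock = sock; }
    }

    private static final Task STOP = new Task(null, -1); // hypothetical shutdown marker

    // One addWork loop feeds `workers` worker threads; returns tasks served.
    static int run(int requests, int workers) {
        BlockingQueue<Task> bag = new LinkedBlockingQueue<>();
        AtomicInteger served = new AtomicInteger();

        Thread[] pool = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            pool[w] = new Thread(() -> {
                try {
                    while (true) {
                        Task t = bag.take();      // any idle worker grabs the next task
                        if (t == STOP) break;
                        served.incrementAndGet(); // "look up, fetch, send via t.sock"
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[w].start();
        }

        try {
            for (int i = 0; i < requests; i++) {  // the addWork loop
                bag.put(new Task("/page" + i, i));
            }
            for (int w = 0; w < workers; w++) bag.put(STOP); // one marker per worker
            for (Thread t : pool) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return served.get();
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 4));
    }
}
```

Every `put` and `take` goes through the same queue, so all workers synchronize on one shared structure.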
Bag of Tasks vs. Producer-Consumer
  Exploits more parallelism
  Even with coarse-grained threads
    – Don't have to break up tasks too finely
  What does task size affect?
    – Possibly latency… smaller might be better
  Easy to change or add new functionality
  But: one major performance problem…
What's the Problem?
  [Figure: addWork thread and workers all accessing one bag]
  Contention: single lock on the shared structure
    – Bottleneck to scalability
Work Queues
  [Figure: executors, each with its own queue of colored tasks]
  Each thread has its own work queue (deque)
    – No single point of contention
  Threads are now generic "executors"
    – Tasks (balls in the figure): blue = parse, yellow = connect…
Work Queues
  [Figure: one executor's queue has run empty]
  Each thread has its own work queue
    – No single point of contention
  Now what?
Work Stealing
  When a thread runs out of work, it steals work from a random other thread
Work Stealing
  When a thread runs out of work, it steals work from the top of a random other thread's deque
  Provably efficient load balancing: for well-structured (fully strict) computations, expected running time is within a constant factor of optimal
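The stealing rule can be sketched in Java as follows. This is a deliberately simple illustration built on `ConcurrentLinkedDeque`; production schedulers such as Java's `ForkJoinPool` use specialized deques. All tasks start on worker 0 so the other workers can only make progress by stealing. The names and the busy-wait termination check are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

class WorkStealingSketch {
    // Returns how many tasks each worker ended up executing.
    static int[] run(int tasks, int workers) {
        List<ConcurrentLinkedDeque<Runnable>> deques = new ArrayList<>();
        for (int w = 0; w < workers; w++) deques.add(new ConcurrentLinkedDeque<>());

        AtomicInteger executed = new AtomicInteger();
        // Deliberately unbalanced start: every task sits on worker 0's deque.
        for (int i = 0; i < tasks; i++) {
            deques.get(0).addLast(() -> executed.incrementAndGet());
        }

        int[] done = new int[workers]; // slot w is written only by worker w
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int me = w;
            threads[w] = new Thread(() -> {
                while (executed.get() < tasks) {
                    // Own work comes off the bottom of the worker's own deque.
                    Runnable task = deques.get(me).pollLast();
                    if (task == null) {
                        // Out of work: steal from the top of a random deque.
                        int victim = ThreadLocalRandom.current().nextInt(workers);
                        task = deques.get(victim).pollFirst();
                    }
                    if (task == null) {
                        Thread.yield(); // nothing found this round; try again
                        continue;
                    }
                    task.run();
                    done[me]++;
                }
            });
            threads[w].start();
        }

        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return done;
    }

    public static void main(String[] args) {
        int[] perWorker = run(10000, 4);
        for (int w = 0; w < perWorker.length; w++) {
            System.out.println("worker " + w + " ran " + perWorker[w] + " tasks");
        }
    }
}
```

The owner and the thieves take from opposite ends of a deque, so they rarely compete for the same task; the per-worker counts vary from run to run.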
Work Stealing Web Server
  Re-structure: readURL, lookUp, addToCache, output
    – myQueue.put(new readURL (url))
  Sequential server:
    while (true) {
      wait for connection;
      read from socket & parse URL;
      look up URL contents in cache;
      if (!in cache) {
        fetch from disk / execute CGI;
        put in cache;
      }
      send data to client;
    }
Work units: readURL, lookUp, addToCache, output
  class Work {
  public:
    virtual void run() = 0;   // each unit of work overrides run()
  };
  class readURL : public Work {
  public:
    void run() {…}
    readURL (socket s) {…}
  };
  Sequential server being restructured:
    while (true) {
      wait for connection;
      read from socket & parse URL;
      look up URL contents in cache;
      if (!in cache) {
        fetch from disk / execute CGI;
        put in cache;
      }
      send data to client;
    }
[Figure: workers whose queues hold readURL, lookUp, addToCache, and output tasks]
  class readURL : public Work {
  public:
    void run() {
      read from socket _s; f = requested file
      myQueue.put (new lookUp(_s, f));
    }
    readURL (socket s) { _s = s; }
  };
  class lookUp : public Work {
  public:
    void run() {
      look in cache for file _f; on a hit, cont = cached contents
      if (!found)
        myQueue.put (new addToCache(_s, _f));   // pass the socket along for the reply
      else
        myQueue.put (new output(_s, cont));
    }
    lookUp (socket s, string f) { _s = s; _f = f; }
  };
  class addToCache : public Work {
  public:
    void run() {
      fetch file _f from disk into cont
      add file to cache (hashmap)
      myQueue.put (new output(_s, cont));
    }
    addToCache (socket s, string f) { _s = s; _f = f; }
  };
Work Stealing Web Server
  Re-structure: readURL, lookUp, addToCache, output
    – myQueue.put(new readURL (url))
  readURL(url) {
    wait for connection;
    read from socket & parse URL;
    myQueue.put (new lookUp (URL));
  }
Work Stealing Web Server
  Re-structure: readURL, lookUp, addToCache, output
    – myQueue.put(new readURL (url))
  readURL(url) {
    wait for connection;
    read from socket & parse URL;
    myQueue.put (new lookUp (URL));
  }
  lookUp(url) {
    look up URL contents in cache;
    if (!in cache) {
      myQueue.put (new addToCache (URL));
    } else {
      myQueue.put (new output(contents));
    }
  }
  addToCache(URL) {
    fetch from disk / execute CGI;
    put in cache;
    myQueue.put (new output(contents));
  }
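The chain of work units can be sketched in Java on top of the standard library's work-stealing pool (`Executors.newWorkStealingPool()`). This is an illustration under several assumptions rather than the slides' design: each stage is a method submitted as a task instead of a `Work` subclass, sockets are integers, responses are collected in a map, the "disk fetch" is a string concatenation, and a `CountDownLatch` tracks completion.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StealingServerSketch {
    private final ExecutorService pool = Executors.newWorkStealingPool();
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<Integer, String> responses = new ConcurrentHashMap<>();
    private final CountDownLatch pending;

    StealingServerSketch(int requests) {
        pending = new CountDownLatch(requests);
    }

    // Each stage does one small step, then enqueues the next stage as a task.
    void readURL(int sock, String rawRequest) {
        String url = rawRequest.trim();                  // "parse URL"
        pool.execute(() -> lookUp(sock, url));
    }

    void lookUp(int sock, String url) {
        String contents = cache.get(url);
        if (contents == null) {
            pool.execute(() -> addToCache(sock, url));   // miss
        } else {
            pool.execute(() -> output(sock, contents));  // hit
        }
    }

    void addToCache(int sock, String url) {
        String contents = "contents of " + url;          // "fetch from disk"
        cache.put(url, contents);
        pool.execute(() -> output(sock, contents));
    }

    void output(int sock, String contents) {
        responses.put(sock, contents);                   // "send data to client"
        pending.countDown();
    }

    // Serves request i on "socket" i; returns the response sent to each socket.
    static Map<Integer, String> serve(List<String> requests) {
        StealingServerSketch s = new StealingServerSketch(requests.size());
        for (int i = 0; i < requests.size(); i++) {
            final int sock = i;
            final String req = requests.get(i);
            s.pool.execute(() -> s.readURL(sock, req));
        }
        try {
            s.pending.await();                           // wait for all output stages
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        s.pool.shutdown();
        return s.responses;
    }

    public static void main(String[] args) {
        System.out.println(serve(Arrays.asList("/a", "/b", "/a")));
    }
}
```

Because the socket travels with every stage, whichever thread ends up running `output` can still reply to the right client.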
Work Stealing
  Works great for heterogeneous tasks
    – Convert addWork and worker into units of work (different colors)
  Flexible: can easily re-define tasks
    – Coarse, fine-grained, anything in between
  Automatic load balancing
  Separates thread logic from functionality
  Popular model for structuring servers
The End