Operational Analysis – L. Grewe. Operational analysis: relationships that do not require any assumptions about the distribution of service times or inter-arrival times.


Operational Analysis L. Grewe

Operational Analysis
Relationships that do not require any assumptions about the distribution of service times or inter-arrival times. Identified originally by Buzen (1976) and later extended by Denning and Buzen (1978). We touch on only some of the techniques/results – in particular, bottleneck analysis. For more details, see the linked reading.

Under the Hood (An Example FSM)
(Figure: a request starts with arrival rate λ, flows among the service centers – CPU, memory cache, file I/O, network – and exits with throughput λ until some center saturates.)

Operational Analysis: Resource Demand of a Request
– CPU: V_CPU visits, for S_CPU units of resource time per visit
– Memory: V_Mem visits, for S_Mem units of resource time per visit
– Disk: V_Disk visits, for S_Disk units of resource time per visit
– Network: V_Net visits, for S_Net units of resource time per visit

Operational Quantities
– T: observation interval
– A_i: # arrivals to device i
– B_i: busy time of device i
– C_i: # completions at device i
– i = 0 denotes the system as a whole

Utilization Law
U_i = B_i / T = (C_i / T) · (B_i / C_i) = X_i · S_i, where X_i is the throughput of device i and S_i its mean service time per completion. The law is independent of any assumption on the arrival/service process. Example: suppose a device processes 125 pkts/sec, and each pkt takes 2 ms. What is the utilization? U = 125 × 2 ms = 25%.
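The utilization law is simple enough to check numerically; a minimal Python sketch of the packet example above (function name is ours, not the slides'):

```python
def utilization(throughput, service_time):
    """Utilization law: U = X * S (X in completions/sec, S in sec per completion)."""
    return throughput * service_time

# Example from the slide: 125 packets/sec, 2 ms of service per packet.
u = utilization(125, 0.002)
print(u)  # 0.25, i.e., the device is 25% utilized
```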

Forced Flow Law
Assume each request visits device i V_i times. Then C_i = V_i · C_0, so X_i = V_i · X, where X = C_0 / T is the system throughput: each device's throughput is the system throughput scaled by the visit count.

Bottleneck Device
Define D_i = V_i · S_i as the total demand of a request on device i. Since U_i = X_i · S_i = X · D_i, the device with the highest D_i has the highest utilization, and thus is called the bottleneck.

Bottleneck vs. System Throughput
Since U_i = X · D_i ≤ 1 at every device, system throughput is capped at X ≤ 1 / max_i D_i: the bottleneck device saturates first and limits the throughput of the whole system.

Example 1
A request may need:
– 10 ms CPU execution time
– 1 Mbyte of network bandwidth
– 1 Mbyte of file access, where 50% of blocks hit in the memory cache
Suppose network bandwidth is 100 Mbps and the disk I/O rate is 1 ms per 8 Kbytes (assuming the program reads 8 KB each time). Where is the bottleneck?

Example 1 (cont.)
– CPU: D_CPU = 10 ms (i.e., at most 100 requests/s)
– Network: D_Net = 1 Mbyte / 100 Mbps = 80 ms (i.e., at most 12.5 requests/s)
– Disk I/O: D_Disk = 0.5 × 1 ms × 1M/8K = 62.5 ms (i.e., at most 16 requests/s)
Disk I/O and the network are more likely to be bottlenecks; a single CPU thread can be enough.
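The demand computation above can be scripted; a small Python sketch using Example 1's assumptions (names and structure are ours):

```python
# Per-request demands D_i = V_i * S_i, in seconds (Example 1's assumptions).
MBYTE = 1_000_000
demands = {
    "cpu":  0.010,                             # 10 ms CPU time
    "net":  (1 * MBYTE * 8) / (100 * 1e6),     # 1 MB over 100 Mbps = 80 ms
    "disk": 0.5 * (1 * MBYTE / 8000) * 0.001,  # 50% miss * 125 blocks * 1 ms
}
bottleneck = max(demands, key=demands.get)  # device with highest demand
max_throughput = 1 / demands[bottleneck]    # requests/sec at saturation
print(bottleneck, max_throughput)           # net 12.5
```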

Example 1 (cont.)
Suppose the arrival/processing rate is 12 requests per second. What is the response time from the disk?
– Utilization of disk = 12 × 0.5 × 125 × 1 ms = 75%
– Using M/M/1 (not an operational law), response time per requested block: R = 1 ms / (1 − 0.75) = 4 ms
– If not cached, a request reads 125 disk blocks: R = 4 ms × 125 = 500 ms
– There is another way to derive R = S / (1 − U).

Background: Little's Law (1961)
For any system with no (or low) loss. Assume mean arrival rate λ, mean time at the device R, and mean number of requests at the device Q. Then the relationship between Q, λ, and R is: Q = λ · R. Example: Yale College admits 1500 students each year, and the mean time a student stays is 4 years. How many students are enrolled? Q = 1500 × 4 = 6000.
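Little's law in one line, with the enrollment example worked out (function name is ours):

```python
def littles_law_q(arrival_rate, residence_time):
    """Little's law: Q = lambda * R (mean number of requests in the system)."""
    return arrival_rate * residence_time

# Yale example: 1500 students/year arrive, mean stay is 4 years.
print(littles_law_q(1500, 4))  # 6000 students enrolled
```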

Little's Law
(Figure: proof sketch plotting cumulative arrivals A against time t.)

Deriving the Relationship Between R, U, and S
Assume flow balance (arrival rate = throughput X). Assume PASTA (Poisson arrivals see time averages – a memoryless arrival sees the time-average Q requests ahead of it). Assume FIFO. A new request then waits for the Q requests ahead of it plus its own service: R = (1 + Q) · S. By Little's law, Q = X · R, and by the utilization law, U = X · S, so R = S + U · R, giving R = S / (1 − U).
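The derivation gives R = S / (1 − U); a sketch reproducing the disk numbers from Example 1 (function name is ours):

```python
def response_time(service_time, utilization):
    """R = S / (1 - U), from flow balance + PASTA + FIFO (also the M/M/1 mean)."""
    assert 0 <= utilization < 1, "saturated device: no finite response time"
    return service_time / (1 - utilization)

# Disk at 75% utilization, 1 ms service per block (Example 1):
r_block = response_time(0.001, 0.75)
print(r_block)        # 0.004 s per block
print(r_block * 125)  # 0.5 s for an uncached 125-block file
```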

Example 2
A request may need:
– 150 ms CPU execution time (e.g., dynamic content)
– 1 Mbyte of network bandwidth
– 1 Mbyte of file access, where 50% of blocks hit in the memory cache
Suppose network bandwidth is 100 Mbps and the disk I/O rate is 1 ms per 8 Kbytes (assuming the program reads 8 KB each time). Now D_CPU = 150 ms exceeds D_Net = 80 ms and D_Disk = 62.5 ms, so the CPU is the bottleneck. Implication: use multiple threads to exploit more CPUs, if available, to avoid the CPU as bottleneck.

Interactive Response Time Law
System setup:
– Closed system with N users
– Each user sends in a request; after the response, the user thinks, then sends the next request
– Notation: Z = user think time, R = response time
– The total cycle time of a user request is R + Z
In a duration T, each user generates T / (R + Z) requests.

Interactive Response Time Law
If there are N users and flow is balanced, system throughput is X = N / (R + Z); equivalently, R = N / X − Z.
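The law rearranges to R = N / X − Z; a minimal sketch (the example numbers here are hypothetical, not from the slides):

```python
def interactive_response_time(n_users, throughput, think_time):
    """Interactive response time law: R = N/X - Z for a closed system of N users."""
    return n_users / throughput - think_time

# Hypothetical: 100 users, observed throughput 20 requests/sec, 3 s think time.
print(interactive_response_time(100, 20, 3))  # 2.0 seconds
```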

Bottleneck Analysis
Combining the utilization bound with the interactive response time law (here D is the sum of the D_i, and D_max = max_i D_i):
X(N) ≤ min(1 / D_max, N / (D + Z))
R(N) ≥ max(D, N · D_max − Z)

Proof
We know U_i = X · D_i ≤ 1 for every device i, so X ≤ 1 / D_max. Also, a request's response time is at least its total demand, R ≥ D. Using the interactive response time law, X = N / (R + Z) ≤ N / (D + Z). Combining the two gives X(N) ≤ min(1 / D_max, N / (D + Z)); rewriting via R = N / X − Z gives R(N) ≥ max(D, N · D_max − Z).
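Putting the two bounds together in code; a sketch using Example 1's demands with a hypothetical user population and think time (the N and Z values are assumptions, not from the slides):

```python
def throughput_bound(n_users, demands, think_time):
    """Bottleneck analysis upper bound: X(N) <= min(1/D_max, N/(D+Z))."""
    d_max, d_total = max(demands), sum(demands)
    return min(1 / d_max, n_users / (d_total + think_time))

def response_bound(n_users, demands, think_time):
    """Bottleneck analysis lower bound: R(N) >= max(D, N*D_max - Z)."""
    d_max, d_total = max(demands), sum(demands)
    return max(d_total, n_users * d_max - think_time)

# Example 1's demands in seconds; hypothetical N = 100 users, Z = 1 s think time.
demands = [0.010, 0.080, 0.0625]
print(throughput_bound(100, demands, 1.0))  # 12.5 req/s (network-bound)
print(response_bound(100, demands, 1.0))    # 7.0 s lower bound on R
```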

In Practice: Common Bottlenecks
– No more file descriptors
– Sockets stuck in TIME_WAIT
– High memory use (swapping)
– CPU overload
– Interrupt (IRQ) overload
[Aaron Bannert]

Summary: Story So Far
– Avoid blocking (so that we can reach bottleneck throughput): introduce threads
– Limit unbounded thread overhead: thread pool, async I/O
– Coordinating data access: synchronization (lock, synchronized)
– Coordinating behavior, avoiding busy-wait: wait/notify; FSM
– Extensibility/robustness: language support / design for interfaces
– System modeling: queueing analysis, operational analysis

Summary: Architecture
Architectures:
– Multi-threaded
– Asynchronous
– Hybrid
Assigned reading: SEDA

Beyond Class: Complete Java Concurrency Framework
Executors: Executor, ExecutorService, ScheduledExecutorService, Callable, Future, ScheduledFuture, Delayed, CompletionService, ThreadPoolExecutor, ScheduledThreadPoolExecutor, AbstractExecutorService, Executors, FutureTask, ExecutorCompletionService
Queues: BlockingQueue, ConcurrentLinkedQueue, LinkedBlockingQueue, ArrayBlockingQueue, SynchronousQueue, PriorityBlockingQueue, DelayQueue
Concurrent collections: ConcurrentMap, ConcurrentHashMap, CopyOnWriteArray{List,Set}
Synchronizers: CountDownLatch, Semaphore, Exchanger, CyclicBarrier
Locks (java.util.concurrent.locks): Lock, Condition, ReadWriteLock, AbstractQueuedSynchronizer, LockSupport, ReentrantLock, ReentrantReadWriteLock
Atomics (java.util.concurrent.atomic): Atomic[Type], Atomic[Type]Array, Atomic[Type]FieldUpdater, Atomic{Markable,Stamped}Reference
See the jcf slides for a tutorial.

Beyond Class: Design Patterns
We have seen Java as an example; C++ and C# can be quite similar. For C++ and general design patterns:
– tutorial4.pdf

Backup Slides

Asynchronous Multi-Process Event-Driven (AMPED)
Like SPED (single-process event-driven), but uses helper processes/threads for:
– disk I/O (to avoid unnecessary blocking), or
– the CPU bottleneck (when D_CPU becomes the bottleneck)
(Figure: an event dispatcher drives the stages Accept Conn → Read Request → Find File → Send Header → Read File → Send Data, with Helper 1 handling blocking operations.)
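A toy sketch of the AMPED idea in Python (all names and the event protocol are hypothetical): a single dispatcher loop handles the non-blocking stages, and would-be blocking reads are shipped to a helper thread so the loop itself never stalls.

```python
import queue
import threading

events = queue.Queue()   # dispatcher's event queue (non-blocking stages)
helpers = queue.Queue()  # work queue for the blocking helper thread

def helper_loop():
    # Helper thread: performs the blocking operation, then hands an
    # event back to the dispatcher (AMPED's key idea).
    while True:
        session, work = helpers.get()
        if session is None:          # shutdown sentinel
            return
        session["data"] = work()     # the blocking I/O happens here
        events.put(("send_data", session))

def dispatch(n_requests):
    # Single-threaded event dispatcher: never performs blocking I/O itself.
    done = []
    while len(done) < n_requests:
        kind, session = events.get()
        if kind == "read_request":
            # Offload the (would-be blocking) file read to a helper.
            helpers.put((session, lambda: b"file-bytes"))
        elif kind == "send_data":
            done.append(session["id"])
    return done

threading.Thread(target=helper_loop, daemon=True).start()
for i in range(3):
    events.put(("read_request", {"id": i}))
result = sorted(dispatch(3))
helpers.put((None, None))  # stop the helper
print(result)              # [0, 1, 2]
```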

Should You Abandon Threads?
No: they remain important for high-end servers (e.g., databases). But avoid threads wherever possible:
– Use events, not threads, for GUIs, distributed systems, low-end servers
– Only use threads where true CPU concurrency is needed
– Where threads are needed, isolate their usage in a threaded application kernel: keep most of the code single-threaded (event-driven handlers over a threaded kernel)
[Ousterhout 1995]

Another View: Control Flow
Events obscure control flow – for programmers and tools. The web server's stages are: Accept Conn. → Read Request → Pin Cache → Read File → Write Response → Exit.

Threads:
    thread_main(int sock) {
        struct session s;
        accept_conn(sock, &s);
        read_request(&s);
        pin_cache(&s);
        write_response(&s);
        unpin(&s);
    }
    pin_cache(struct session *s) {
        pin(&s);
        if( !in_cache(&s) )
            read_file(&s);
    }

Events:
    AcceptHandler(event e) {
        struct session *s = new_session(e);
        RequestHandler.enqueue(s);
    }
    RequestHandler(struct session *s) {
        ...; CacheHandler.enqueue(s);
    }
    CacheHandler(struct session *s) {
        pin(s);
        if( !in_cache(s) ) ReadFileHandler.enqueue(s);
        else ResponseHandler.enqueue(s);
    }
    ...
    ExitHandler(struct session *s) {
        ...; unpin(&s); free_session(s);
    }

[von Behren]

Exceptions
Exceptions complicate control flow:
– Harder to understand program flow
– Cause bugs in cleanup code
In the threaded version, an early return must be added (if( !read_request(&s) ) return;); in the event version, a handler bails out mid-pipeline (if( error ) return; before CacheHandler.enqueue(s)). The rest of the code is as on the previous slide. [von Behren]

State Management
Events require manual state management, and it is hard to know when to free session state: use GC or risk bugs. The thread version keeps per-request state on the stack (struct session s in thread_main); the event version must heap-allocate it (new_session) and explicitly free_session in the exit handler. The code is as on the previous slides. [von Behren]