An Approach to Measuring Large-Scale Distributed Systems
Jun Li, Peter Reiher, Gerald Popek, and Mark Yarvis (UCLA)
Geoffrey H. Kuenning (Harvey Mudd College)

2 How to Measure Internet-Scale Systems?
• Distributed systems have complex performance at large sizes
• Would like to measure & tune before deployment
• Biggest research testbeds are tiny relative to the Internet
• Only Internet-scale testbed is the Internet itself

3 Live Internet Measurement
• Difficult or impossible to get cooperation
• Difficult to control remote sites
• Extraneous noise in measurements

4 The Simulation Option
• Usually requires models of real software
• Expensive to develop
• Possible inaccuracy or bugs
• Must be validated against real system
• Simulation usually much slower than reality

5 Measuring Big Distributed Systems is Tough
• Only one really big testbed: the Internet
  ◦ Can't get enough participants
  ◦ Too much noise for repeatable measurements
• Simulations don't use the real software
  ◦ Hard to validate
• Small testbeds don't reveal scaling problems

6 Testbed Overloading
• Use real software
• Run multiple instances on one machine
• Virtual topology to simulate connectivity
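
The idea on this slide can be illustrated with a minimal Python sketch. It is not the authors' framework: the `VirtualTopology` class, the `place_logical_nodes` helper, and the round-robin placement policy are all assumptions made for illustration. Many logical nodes are mapped onto a few physical hosts, and a virtual adjacency map decides which logical nodes count as neighbors, so hop counts depend only on the virtual links and not on physical placement.

```python
from collections import deque


class VirtualTopology:
    """Hypothetical adjacency map: which logical nodes are virtual neighbors."""

    def __init__(self, links):
        self.neighbors = {}
        for a, b in links:
            # Links are treated as bidirectional.
            self.neighbors.setdefault(a, set()).add(b)
            self.neighbors.setdefault(b, set()).add(a)

    def hop_counts(self, source):
        """Breadth-first search: virtual hop count from source to each node."""
        hops = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in self.neighbors.get(node, ()):
                if nxt not in hops:
                    hops[nxt] = hops[node] + 1
                    queue.append(nxt)
        return hops


def place_logical_nodes(logical_nodes, physical_hosts):
    """Assumed policy: spread logical nodes round-robin over physical hosts."""
    placement = {host: [] for host in physical_hosts}
    for i, node in enumerate(logical_nodes):
        host = physical_hosts[i % len(physical_hosts)]
        placement[host].append(node)
    return placement
```

For example, `place_logical_nodes(list(range(6)), ["hostA", "hostB"])` puts three logical nodes on each host, while `VirtualTopology([(0, 1), (1, 2), (2, 3)]).hop_counts(0)` still reports a four-node chain regardless of where the nodes physically run.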

7 Characteristics of Overloading
• Allows greatly increased scale
• Works best when applications are lightweight
• Some (not all) measurements will differ

8 Effects of Overloading
• Some metrics unaffected
  ◦ Hop count
  ◦ Bytes transferred per (virtual) node
  ◦ Storage cost
• Other metrics must be adjusted due to resource competition
  ◦ CPU processing times
  ◦ Latencies

9 Eliminating Interference
• Locking to avoid contention
• Characterize slowdown
• Divide and conquer

10 Locking to Avoid Contention
• Use central coordinator
• One process at a time initiates operation x
• Measure latency, bytes transferred, messages exchanged
• No contention because of serialization
• Works well for operations that are one-at-a-time in real world (e.g., join multicast group)
• Total run time increases
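
A minimal Python sketch of the serialization idea follows. It assumes the logical nodes are threads in one process and that the central coordinator can be modeled as a single lock; the `Coordinator` class, `measure_all`, and the `fake_join` stand-in operation are hypothetical names, and the byte and message counts are made-up placeholders rather than measured values.

```python
import threading
import time


class Coordinator:
    """Hypothetical central coordinator: admits one measured operation at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.records = []

    def run_serialized(self, node_id, operation):
        # Only the lock holder may initiate the operation, so the timed
        # region never overlaps with another node's timed region.
        with self._lock:
            start = time.perf_counter()
            stats = operation(node_id)
            end = time.perf_counter()
            self.records.append(
                {"node": node_id, "start": start, "end": end,
                 "latency_s": end - start, **stats}
            )


def fake_join(node_id):
    """Stand-in for a real operation such as joining a multicast group."""
    time.sleep(0.001)
    return {"bytes": 128, "messages": 2}  # placeholder values


def measure_all(num_nodes, operation):
    """Start one thread per logical node; the coordinator serializes them."""
    coordinator = Coordinator()
    threads = [
        threading.Thread(target=coordinator.run_serialized, args=(n, operation))
        for n in range(num_nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return coordinator.records
```

Because every operation waits for the lock, total run time grows roughly with the number of nodes, which matches the cost noted on the slide.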

11 Slowdown Analysis
• Measure time for one logical node on a physical node
• Measure time for n logical nodes
• Develop slowdown factor as function of n
• Apply to measured results
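
The four steps above can be sketched in Python as follows. The slide does not say what form the slowdown function takes, so the linear least-squares fit here is purely an assumption, as are the function names and the calibration numbers in the usage note.

```python
def slowdown_factors(calibration):
    """calibration maps n (logical nodes per host) to measured time.

    The factor for each n is its time divided by the single-node time.
    """
    base = calibration[1]
    return {n: t / base for n, t in calibration.items()}


def fit_linear(factors):
    """Least-squares fit of factor(n) = intercept + slope * n (assumed model).

    Needs calibration points for at least two distinct values of n.
    """
    xs = list(factors)
    ys = [factors[x] for x in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def adjust(measured, n, intercept, slope):
    """Divide a time measured under overloading by the fitted slowdown factor."""
    return measured / (intercept + slope * n)
```

With hypothetical calibration times `{1: 2.0, 2: 4.0, 4: 8.0}`, the factors are 1, 2, and 4, the fit is approximately `factor(n) = n`, and a time of 12.0 measured with three logical nodes per host would be adjusted to about 4.0.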

12 Divide and Conquer
• Divide task into components
  ◦ Must be independent
  ◦ No parallelism
  ◦ Contention only at component boundaries
• Measure components individually in isolation
• Measure occurrences in full system & sum
• Resource contention now omitted from total

13 Divide-and-Conquer Example
• Components of dissemination latency in Revere
  ◦ Local processing time
  ◦ Kernel-space crossing
  ◦ Transmission delay (per hop)
• Each component measured in isolation
• Sum multiplied by observed hop count
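
The arithmetic on this slide can be written as a short Python sketch: sum the per-hop components, then multiply by the hop count observed in the full system. The `HopComponents` type, the function name, and the millisecond values in the usage note are illustrative assumptions, not measurements from Revere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HopComponents:
    """Per-hop latency components, each obtained in isolation."""
    local_processing_ms: float   # measured
    kernel_crossing_ms: float    # measured
    transmission_ms: float       # supplied as a parameter

    def per_hop_ms(self):
        # The components are assumed independent and sequential,
        # so the per-hop latency is their plain sum.
        return (self.local_processing_ms
                + self.kernel_crossing_ms
                + self.transmission_ms)


def dissemination_latency_ms(components, hop_count):
    """Estimated latency: per-hop sum times the observed hop count."""
    return components.per_hop_ms() * hop_count
```

For instance, with hypothetical components of 2.0 ms, 0.5 ms, and 10.0 ms and an observed hop count of 4, `dissemination_latency_ms(HopComponents(2.0, 0.5, 10.0), 4)` gives 50.0 ms. Resource contention among co-located logical nodes never enters the total, because each term was measured in isolation.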

14 Dissemination Latency
[Figure: message path from previous hop to next hop through the OS, Java, and Revere layers. Labels: local processing time (measured), kernel-crossing time (measured), per-hop transmission latency (parameter).]

15 Measurement Environment
[Figure: Revere, Java, and OS layers, divided into user space and kernel space.]
Delays
• Sum known times
• Multiply by hop count

16 Open Issues
• Measurement framework for arbitrary applications
• Scalability of locking approach

17 Conclusions
• Method for measuring much larger systems
• Used to measure Revere on 3000 virtual nodes
• Avoids drawbacks of other approaches

An Approach to Measuring Large-Scale Distributed Systems
Jun Li, Peter Reiher, Gerald Popek, and Mark Yarvis (UCLA)
Geoffrey H. Kuenning (Harvey Mudd College)
lijun@lasr.cs.ucla.edu
geoff@cs.hmc.edu
http://lasr.cs.ucla.edu/revere
