Internet-Scale Research at Universities Panel Session SAHARA Retreat, Jan 2002 Prof. Randy H. Katz, Bhaskaran Raman, Z. Morley Mao, Yan Chen.

1 Internet-Scale Research at Universities Panel Session SAHARA Retreat, Jan 2002 Prof. Randy H. Katz, Bhaskaran Raman, Z. Morley Mao, Yan Chen

2 Problem Statement
Overlay network for service composition
Want to study recovery algorithms
Lots of client sessions
Methodology for evaluation of design?
–Simulation?
  Slow, does not scale with #nodes, #client sessions
  Does not bring out processing bottlenecks
–Real testbed?
  Cannot be large; setup and management problems
  Non-repeatable, not good for controlled design study
[Figure labels: Internet; Service cluster: compute cluster capable of running services; Peering: exchange perf. info.; Destination; Source]

3 Our approach so far…
Emulation platform
–Real implementation of software, but emulation of n/w parameters
–Inspired by NistNET
–Developed our own user-level implementation
  Gave us better control
–Runs on the Millennium cluster of workstations
–Central bottleneck: 20,000 pkts/sec
[Figure labels: App; Lib; Node 1; Node 2; Node 3; Node 4; Rule for 1 → 2; Rule for 1 → 3; Rule for 3 → 4; Rule for 4 → 3; Emulator]
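The per-pair rules in the emulator figure can be pictured as a lookup table keyed by (source, destination). The following is a minimal Python sketch under assumptions, not the emulator described in the talk: the names `Rule`, `Emulator`, `add_rule`, and `forward` are hypothetical, and the rule fields (a one-way delay and an up/down flag) are chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Rule:
    """Hypothetical emulation rule for one directed node pair."""
    delay_ms: float   # one-way latency to add to each packet
    up: bool = True   # False models an outage on this path


class Emulator:
    """Central table of per-pair rules (illustrative only)."""

    def __init__(self) -> None:
        self.rules: Dict[Tuple[int, int], Rule] = {}

    def add_rule(self, src: int, dst: int, rule: Rule) -> None:
        # Rules are directional: (3, 4) and (4, 3) are separate entries.
        self.rules[(src, dst)] = rule

    def forward(self, src: int, dst: int, now_ms: float) -> Optional[float]:
        """Return the delivery time in ms, or None if the packet is dropped."""
        rule = self.rules.get((src, dst))
        if rule is None:
            return now_ms          # no rule: deliver without added delay
        if not rule.up:
            return None            # path is in an outage period
        return now_ms + rule.delay_ms


emu = Emulator()
emu.add_rule(1, 2, Rule(delay_ms=30.0))
emu.add_rule(3, 4, Rule(delay_ms=10.0, up=False))
```

Because every packet passes through one such table, a single emulator process is a natural central bottleneck, which is consistent with the packets-per-second limit noted on the slide.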

4 Parameters modeled
Overlay topology:
–Generate 6,510-node physical network using GT-ITM
–Choose subset of nodes for overlay network
Latency modeling:
–Base latency according to edge weight
–Variation in accordance with: RTT spikes are isolated
Outage period:
–Using traces
–Collected UDP-based measurements across 12 host pairs
–Berkeley, Stanford, UNSW (Australia), UIUC, TU-Berlin (Germany), CMU
–CDF of outage periods, used to model outage periods
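One way to turn a measured CDF of outage periods into a model is inverse-transform sampling on the empirical distribution. The sketch below assumes that approach; the slides do not say how the CDF was actually used. The function names and the sample durations are hypothetical.

```python
import bisect
import random
from typing import Callable, List, Optional, Sequence


def empirical_cdf(durations: Sequence[float], x: float) -> float:
    """Fraction of observed outage durations that are <= x."""
    ordered = sorted(durations)
    return bisect.bisect_right(ordered, x) / len(ordered)


def make_outage_sampler(
    durations: Sequence[float], rng: Optional[random.Random] = None
) -> Callable[[], float]:
    """Return a function that draws outage durations from the empirical CDF."""
    if not durations:
        raise ValueError("need at least one observed outage duration")
    rng = rng or random.Random()
    ordered: List[float] = sorted(durations)
    n = len(ordered)

    def sample() -> float:
        u = rng.random()                 # uniform in [0, 1)
        idx = min(int(u * n), n - 1)     # invert the step-shaped CDF
        return ordered[idx]

    return sample


# Made-up durations in seconds, for illustration only (not trace data).
observed = [1.0, 2.0, 2.0, 5.0]
draw_outage = make_outage_sampler(observed, random.Random(42))
```

Sampling this way reproduces the observed distribution exactly but never generates a duration that was not in the trace; fitting a parametric distribution would be an alternative.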

5 My experience in Internet measurement
Goal
–collect client-Local DNS server associations
–to evaluate DNS-based server selection
Built a measurement infrastructure
Three components
–1x1 pixel embedded transparent GIF image
  http://xxx.rd.example.com/tr.gif
–A specialized authoritative DNS server
  Allows hostnames to be wild-carded
–An HTTP redirector
  Always responds with “302 Moved Temporarily”
  Redirect to a URL with client IP address embedded

6 My experience in Internet measurement
[Figure labels: Client [10.0.0.1]; Redirector for xxx.rd.example.com; Local DNS server; Content server for the image; Name server for *.cs.example.com]
1. HTTP GET request for the image
2. HTTP redirect to IP10-0-0-1.cs.example.com
3. Request to resolve IP10-0-0-1.cs.example.com
4. Request to resolve IP10-0-0-1.cs.example.com
5. Reply: IP address of content server
6. Reply: content server IP address
7. HTTP GET request for the image
8. HTTP response
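The hostname trick in steps 2 to 4 can be sketched in a few lines of Python. The redirector encodes the client address into a hostname, and the name server decodes it from the incoming query; pairing the decoded address with the source address of that DNS query gives one client-to-local-DNS-server association. The hostname format and zone come from the slide; the function names are hypothetical, and IPv4-only handling is an assumption.

```python
import ipaddress
from typing import Optional

ZONE = "cs.example.com"  # wild-carded zone shown on the slide


def redirect_hostname(client_ip: str, zone: str = ZONE) -> str:
    """Hostname the redirector would place in its 302 Location URL."""
    addr = ipaddress.IPv4Address(client_ip)   # raises ValueError if malformed
    return "IP" + str(addr).replace(".", "-") + "." + zone


def client_ip_from_query(qname: str, zone: str = ZONE) -> Optional[str]:
    """Recover the client IP from a DNS query name, or None if it does not match."""
    name = qname.rstrip(".")                  # tolerate a trailing root dot
    suffix = "." + zone
    if not name.lower().endswith(suffix):
        return None
    label = name[: -len(suffix)]
    if "." in label or not label.upper().startswith("IP"):
        return None
    try:
        return str(ipaddress.IPv4Address(label[2:].replace("-", ".")))
    except ValueError:
        return None
```

For example, `redirect_hostname("10.0.0.1")` yields the name shown in step 2, and the name server can log `client_ip_from_query(qname)` next to the address of the resolver that sent the query.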

7 My lessons
Common myths about Internet measurements
–Measurements done from University sites are representative of the Internet
–The following are good proximity metrics:
  AS hop count
  Router hop count
–I can just quote some measurement results from previous papers
  W/o carefully considering their applicability
A scalable measurement methodology helps ease of adoption

8 Content Distribution Network (CDN)
• Dynamic clustering for efficient Web content replication
• Use greedy algorithm for replica placement to reduce the response latency of end users
• Trace-driven simulation to find optimal granularity of replication
• Network Topology:
  • Pure-random & transit-stub models from GT-ITM
  • A real AS-level topology from 7 widely-dispersed BGP peers
• Real world traces:
  -- Cluster MSNBC Web clients with BGP prefix
     - BGP tables from a BBNPlanet router on 01/24/2001
     - 10K clusters left, choose top 10% covering >70% of requests
  -- Cluster NASA Web clients with domain names

Web Site   Period      Duration   Total Requests   Requests/day
MSNBC      8-10/1999   10–11am    10,284,735       1,469,248 (1 hr)
NASA       7/1995      All day    3,461,612        56,748
WorldCup   5-7/1998    All day    1,352,804,107    15,372,774
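A greedy replica placement can be sketched as follows: at each step, add the candidate site that most reduces total request-weighted latency, given the replicas already chosen. This is an illustrative Python sketch under assumptions; the slide does not state the cost function used, and the names `greedy_placement`, `latency`, and `demand` as well as the toy numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def greedy_placement(
    latency: Dict[str, Dict[str, float]],  # latency[site][cluster]
    demand: Dict[str, float],              # requests per client cluster
    k: int,                                # number of replicas to place
) -> Tuple[List[str], float]:
    """Pick up to k sites greedily; return (sites, total weighted latency)."""
    chosen: List[str] = []
    # Best latency seen so far for each cluster (infinite before any replica).
    best = {c: float("inf") for c in demand}

    for _ in range(min(k, len(latency))):
        def cost_with(site: str) -> float:
            # Total cost if this site were added to the current set.
            return sum(demand[c] * min(best[c], latency[site][c]) for c in demand)

        candidate = min((s for s in latency if s not in chosen), key=cost_with)
        chosen.append(candidate)
        for c in demand:
            best[c] = min(best[c], latency[candidate][c])

    return chosen, sum(demand[c] * best[c] for c in demand)


# Toy input: three candidate sites, two client clusters, equal demand.
latency = {
    "A": {"c1": 10, "c2": 100},
    "B": {"c1": 100, "c2": 10},
    "C": {"c1": 40, "c2": 40},
}
demand = {"c1": 1, "c2": 1}
```

On this toy input the greedy choice for two replicas is C then A (total cost 50), while placing A and B would cost 20. Greedy placement is a heuristic and does not guarantee the optimal set.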

9 Wide-area Network Distance Estimation
Problem formulation: Given N end hosts that belong to different administrative domains, how to select a subset of them to be probes and build an overlay distance estimation service without knowing the underlying topology?
Solution: Internet Iso-bar
–Cluster hosts that perceive similar performance to the Internet & select a monitor for each cluster for active and continuous probing
–Clustering with congestion/path outage correlation
–Evaluate the prediction accuracy and stability
Evaluation Methodology (I)
–NLANR AMP data set
  119 sites in the US (106 after filtering out sites that were mostly off)
  Traceroute between every pair of hosts every minute
  Clustering uses daily geometric mean of round-trip time (RTT)
  Raw data: 6/24/00 – 12/3/01
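The daily geometric mean of RTT used for clustering can be computed as the exponential of the mean log-RTT. A minimal sketch, assuming RTT samples in milliseconds; the function name and sample values are hypothetical:

```python
import math
from typing import Sequence


def geometric_mean_rtt(rtts_ms: Sequence[float]) -> float:
    """Geometric mean of positive RTT samples, computed in log space."""
    if not rtts_ms:
        raise ValueError("no RTT samples")
    if any(r <= 0 for r in rtts_ms):
        raise ValueError("RTT samples must be positive")
    return math.exp(sum(math.log(r) for r in rtts_ms) / len(rtts_ms))


# Made-up samples: a single spike pulls the arithmetic mean far more
# than the geometric mean.
samples = [10.0, 1000.0]
```

For these two samples the arithmetic mean is 505 ms while the geometric mean is about 100 ms, which is one reason a geometric mean is less sensitive to isolated RTT spikes.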

10 Evaluation Methodology (II)
Keynote Website Perspective benchmarking
–Measure Web site performance from more than 100 agents
–Heterogeneous core network: various ISPs
–Heterogeneous access network: dial-up 56K, DSL and high-bandwidth business connections
–Agent locations
  America (including Canada, Mexico): 67 agents in 29 cities from 15 ISPs
  Europe: 25 agents in 12 cities from 16 ISPs
  Asia: 8 agents in 6 cities from 8 ISPs
  Australia: 3 agents in 3 cities from 3 ISPs
–40 most popular Web servers for benchmarking
Side problem: how to reduce the number of agents and/or servers, but still represent the majority of end-user performance for a reasonably long period?

11 Discussion: Difficulties of Internet measurement
Results vary greatly depending on your measurement methodology
–The number and identity of sites you measure
  Commercial vs. educational sites
–Your measurement location
  Well-connected site vs. dialup site
  Backbone vs. access network, server vs. client
–Time when measurement is taken
  Time of day, day of year
  Transient effects, e.g., network congestion, flash crowd
–Frequency of measurements (for correlation studies)
–Intrusiveness of the measurement
  Does the measurement affect what you are measuring?

12 Discussion: Issues with Emulation
Emulation platform: modeling correlations in n/w behavior
–What happens in one part of the Internet may have non-zero correlation with behavior of another part
Scale of topology
–We have O(100) machines in department
–O(1500) machines on campus
–Is this believable?

