Evaluation of the Proximity between Web Clients and their Local DNS Servers Z. Morley Mao UC Berkeley Chuck Cranor, Fred Douglis,

Evaluation of the Proximity between Web Clients and their Local DNS Servers
Z. Morley Mao, UC Berkeley
Chuck Cranor, Fred Douglis, Michael Rabinovich, Oliver Spatscheck, and Jia Wang, AT&T Labs--Research

Motivation
- Content Distribution Networks (CDNs)
  - Try to deliver content from servers close to users
- Current server selection mechanisms
  - Use the Domain Name System (DNS)
  - Assume that clients are close to their local DNS servers: the “originator problem”
- Goal: verify the assumption that clients are close to their local DNS servers

Measurement setup
Three components:
- A 1x1-pixel transparent GIF image embedded in the page
- A specialized authoritative DNS server
  - Allows hostnames to be wild-carded
- An HTTP redirector
  - Always responds with “302 Moved Temporarily”
  - Redirects to a URL with the client IP address embedded

Embedded image request sequence
Participants: client, redirector for xxx.rd.example.com, local DNS server, name server for *.cs.example.com, content server for the image
1. HTTP GET request for the image
2. HTTP redirect to IP cs.example.com
3. Request to resolve IP cs.example.com
4. Request to resolve IP cs.example.com
5. Reply: IP address of content server
6. Reply: content server IP address
7. HTTP GET request for the image
8. HTTP response
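The redirect and resolution steps above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the slides say only that the client IP is embedded in the redirect URL, so the `ip192-0-2-7` label encoding, the `/pixel.gif` path, and both function names are assumptions.

```python
import ipaddress

# Zone served by the specialized authoritative DNS server (name from the slides).
ZONE = "cs.example.com"


def redirect_location(client_ip: str, path: str = "/pixel.gif") -> str:
    """Build the Location target of the 302 reply, embedding the client IP.

    The label encoding (ip + dashed IPv4 address) is an assumed convention.
    """
    label = "ip" + client_ip.replace(".", "-")
    return f"http://{label}.{ZONE}{path}"


def association_from_query(qname: str, ldns_ip: str):
    """At the name server: recover (client IP, LDNS IP) from one DNS query.

    qname is the queried hostname; ldns_ip is the source address of the query,
    i.e. the client's local DNS server. Returns None for unrelated names.
    """
    name = qname.rstrip(".").lower()
    suffix = "." + ZONE
    if not name.endswith(suffix):
        return None
    label = name[: -len(suffix)]
    if not label.startswith("ip"):
        return None
    candidate = label[2:].replace("-", ".")
    try:
        ipaddress.IPv4Address(candidate)  # reject labels that are not IPv4 addresses
    except ValueError:
        return None
    return candidate, ldns_ip
```

Because every client is redirected to a distinct hostname, the lookup cannot be answered from the LDNS cache, so each query that reaches the name server yields one client-LDNS association.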

Measurement impact
- Image (43 bytes) embedded at the end of the page, so it is requested last
- Keynote measurement: average download latency (sec) without and with the image, and the increased overhead (%), for world-wide and US locations

Measurement data
Site | Participant | Image hit count | Duration
1 | att.com | 20,816,927 | 2 months
2,3 | Personal pages (commercial domain) | 1,743 | 3 months
4 | AT&T research | 212,814 | 3 months
5-7 | University sites | 4,367,076 | 3 months
8-19 | Personal pages (university domain) | 26,563 | 3 months

Measurement statistics
Data type | Count
Unique client-LDNS associations | 4,253,157
HTTP requests | 25,425,123
Unique client IPs | 3,234,449
Unique LDNS IPs | 157,633
Client-LDNS associations where client and LDNS have the same IP address | 56,086

Top 10 busy ASes by request count
AS number | Organization | Request count
7018 | AT&T | 876,
 | UUNET | 779,
 | AT&T BMGS | 239,989
1 | BBN Planet | 225,
 | Sprint | 153,
 | IBM | 145,
 | Level 3 | 143,
 | Earthlink | 110,
 | RoadRunner | 107,115

Proximity metrics: 1. AS clustering, 2. network clustering
- AS clustering
  - Observes whether client and LDNS belong to the same AS
- Network clustering
  - Network clusters are based on BGP routing information, using longest prefix match
  - Observes whether client and LDNS belong to the same network cluster
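As a rough sketch of the two clustering checks, the Python below maps an address to its longest matching prefix and compares the results for a client and its LDNS. The prefix table and AS numbers are made-up documentation values, and the function names are hypothetical; a real table would come from BGP routing data.

```python
import ipaddress

# Hypothetical routing table: prefix -> origin AS (illustrative values only).
PREFIX_TO_AS = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,
    ipaddress.ip_network("192.0.2.128/25"): 64500,
    ipaddress.ip_network("198.51.100.0/24"): 64501,
}


def network_cluster(ip: str):
    """Return the longest prefix that contains ip, or None if no prefix matches."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in PREFIX_TO_AS if addr in net]
    return max(matches, key=lambda net: net.prefixlen, default=None)


def same_network_cluster(client_ip: str, ldns_ip: str) -> bool:
    """Fine-grained check: both addresses fall under the same longest prefix."""
    cluster = network_cluster(client_ip)
    return cluster is not None and cluster == network_cluster(ldns_ip)


def same_as(client_ip: str, ldns_ip: str) -> bool:
    """Coarse-grained check: both addresses are originated by the same AS."""
    c, l = network_cluster(client_ip), network_cluster(ldns_ip)
    return c is not None and l is not None and PREFIX_TO_AS[c] == PREFIX_TO_AS[l]
```

With this table, `192.0.2.10` and `192.0.2.200` share an AS but not a network cluster, which illustrates why the AS metric is the coarser of the two.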

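Proximity metric: 3. traceroute divergence
- Trace the routes from a probe machine to the client and to its local DNS server
- Use the last point of divergence
- Traceroute divergence in the figure's example: Max(3, 4) = 4, where 3 and 4 are the hop counts (a, b) of the two branches after the paths diverge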
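The metric can be sketched as follows, under one reading of the slide: the divergence point is the last hop shared by both traceroutes, and the divergence is the larger of the two remaining hop counts. The hop names and the function name are hypothetical, and each path is assumed to list hops in order, ending at the destination.

```python
def traceroute_divergence(path_to_client, path_to_ldns):
    """Max hop count from the last shared hop to the client and to the LDNS.

    Each path is a list of hop identifiers seen from the probe machine, with
    the destination as the last element.
    """
    # Position of each hop on the LDNS path (the last occurrence wins).
    ldns_index = {hop: j for j, hop in enumerate(path_to_ldns)}
    # Walk the client path backwards to find the last hop that is shared.
    for i in range(len(path_to_client) - 1, -1, -1):
        j = ldns_index.get(path_to_client[i])
        if j is not None:
            to_client = len(path_to_client) - 1 - i
            to_ldns = len(path_to_ldns) - 1 - j
            return max(to_client, to_ldns)
    # No shared hop: the probe machine itself is the divergence point.
    return max(len(path_to_client), len(path_to_ldns))


# Example matching the slide's Max(3, 4): three hops remain to the client
# and four to the LDNS after the last shared router r2.
client_path = ["r1", "r2", "c1", "c2", "client"]
ldns_path = ["r1", "r2", "d1", "d2", "d3", "ldns"]
divergence = traceroute_divergence(client_path, ldns_path)
```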

Proximity metric: 4. round-trip time correlation
- Correlation between message round-trip times from a probe site to the client and to its LDNS server
- The probe site represents a potential cache server location
- A crude metric, highly dependent on the probe site
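The slides do not say which correlation measure was used. As one plausible reading, the sketch below computes the Pearson coefficient over paired round-trip times measured from a single probe site; the choice of Pearson and the function name are assumptions.

```python
import math


def rtt_correlation(client_rtts, ldns_rtts):
    """Pearson correlation between RTTs to clients and RTTs to their LDNSs.

    client_rtts[i] and ldns_rtts[i] belong to the same client-LDNS pair and
    are measured from the same probe site.
    """
    n = len(client_rtts)
    if n != len(ldns_rtts) or n < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    mean_c = sum(client_rtts) / n
    mean_l = sum(ldns_rtts) / n
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(client_rtts, ldns_rtts))
    var_c = sum((c - mean_c) ** 2 for c in client_rtts)
    var_l = sum((l - mean_l) ** 2 for l in ldns_rtts)
    if var_c == 0 or var_l == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_c * var_l)
```

A coefficient near 1 would mean that, from this probe site, the LDNS latency is a good stand-in for the client latency; a different probe site can give a different answer, which is why the slide calls the metric crude.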

Aggregate statistics of AS/network clustering
- About 12,000 ASes in total; close to 80% of them observed
- 440,000 unique prefixes; 25% of all possible network clusters observed
Metrics | # client clusters | # LDNS clusters | Total # clusters
AS clustering | 9,215 | 8,590 | 9,570
Network clustering | 98,001 | 53,321 | 104,950

Proximity analysis results: AS, network clustering
Metrics | Client IPs | HTTP requests
AS cluster | 64% | 69%
Network cluster | 16% | 24%
- AS clustering: coarse-grained
- Network clustering: fine-grained
- Most clients are not in the same routing entity as their LDNS
- Clients with an LDNS in the same cluster are slightly more active

Proximity analysis results: traceroute divergence
- Probe sites: NJ (UUNET), NJ (AT&T), Berkeley (calren), Columbus (calren)
- Sampled from the top half of busy network clusters
- Median divergence: 4
- Mean divergence:
- Ratio of common to disjoint path length
  - 72%-80% of the pairs traced have a common path at least as long as the disjoint path

Improved local DNS configuration
- For client-LDNS associations not in the same cluster, does there exist an LDNS in the client’s cluster?
Metrics | Client IPs (original) | Client IPs (improved) | HTTP requests (original) | HTTP requests (improved)
AS cluster | 64% | 88% | 69% | 92%
Network cluster | 16% | 66% | 24% | 70%
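A small sketch of how the "original" and "improved" client shares could be computed from the observed associations. It assumes each address has already been mapped to a cluster identifier and counts a client as covered if any of its associations qualifies; the slides do not spell out these details, so both the counting rule and the names are assumptions.

```python
def proximity_shares(associations, cluster_of):
    """Return (original, improved) fractions of clients with an in-cluster LDNS.

    associations: list of (client_ip, ldns_ip) pairs.
    cluster_of: dict mapping every address to its cluster identifier.
    original: the client already uses an LDNS in its own cluster.
    improved: some observed LDNS exists in the client's cluster, so the
              client could be reconfigured to use it.
    """
    clients = {client for client, _ in associations}
    ldns_clusters = {cluster_of[ldns] for _, ldns in associations}
    original = {c for c, l in associations if cluster_of[c] == cluster_of[l]}
    improved = {c for c in clients if cluster_of[c] in ldns_clusters}
    return len(original) / len(clients), len(improved) / len(clients)


# Toy data: c1 already shares cluster A with its LDNS; c2 could switch to
# an LDNS in cluster B; no LDNS was observed in c3's cluster C.
pairs = [("c1", "l1"), ("c2", "l1"), ("c3", "l2")]
clusters = {"c1": "A", "l1": "A", "c2": "B", "l2": "B", "c3": "C"}
original_share, improved_share = proximity_shares(pairs, clusters)
```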

Clients using multiple LDNS
- A single client IP can be associated with multiple LDNSs:
  - First LDNS times out, second is contacted
  - LDNS assigned dynamically through a DHCP server
  - LDNS configuration with multiple IPs
  - Client IP reused by different users
  - Client IP is the address of a NAT or proxy
  - Misconfiguration
- The majority of clients (78%) are associated with a single LDNS

Clients using 10 or fewer LDNS
# clients (% total) | # LDNS (avg # NAC) | % total HTTP requests | % associations in client’s NAC
2,524,939 (78.1) | 1 (1) | |
,228 (16.1) | 2 (1.6) | |
,524 (3.8) | 3 (2.1) | |
,422 (1.3) | 4 (2.5) | |
,469 (0.4) | 5 (2.9) | |
,555 (9.1) | 6 (3.3) | |
,590 (0.049) | 7 (4.1) | |
(0.022) | 8 (4.7) | |
(0.014) | 9 (5.5) | |
(0.008) | 10 (6.1) | |

Client IPs using large number of LDNSs
Common domain names ( LDNS):
*.MIL, apnc*, *bbnplanet.com, *hsacorp.net, *webcache.rcn.net, cache*.webcache.rcn.net, cache0*.proxy.aol.com, cache.brightok.net, cache*.ruh.isu.net.sa, *.onenet.net, hh*.direcpc.com, cob-cache.r.state.mn.us, mango.arctic.net, netcache.net.ca.gov, proxy.*.netsetter.com, *.nortelnetworks.com, rad.afonline.net, *.prserv.net, *.cisco.com, ss*.co.us.ibm.com, thing5.csc.com, *.wwwcache.ja.net

Example client IP using large number of LDNSs
- Client (proxy.sjc.netsetter.com)
  - Using 241 LDNS
  - 753 requests
- Belongs to marketscore.com, which offers a free browser plug-in for web acceleration
- Using the client’s LDNS to do name resolution on behalf of the client?
- HTTP headers:
  - Via header: NetCache Network Appliance
  - X-forwarded-for:
  - Client-ip: client IP address (dialup customers)

Top LDNS serving most clients
DNS name | # clients served | Organization
Ns?.worldnet.att.net | 68000 | AT&T
Ns1.us.prserv.net | 42000 | IBM
Nscache3.eng00.mindspring.net | 23000 | mindspring
Rns2.earthlink.net | 17000 | Earthlink
Lax1-dns.lax.netzero.net | 13000 | netzero
Dns1.mtry01.pacbell.net | 12000 | Pac bell
Ns.mia.bellsouth.net | 12000 | Bellsouth
Dialcache040.ns.uu.net | 11000 | UUNET

Examination of clients from individual ASes
Organization (AS #) | AS cluster | Network cluster | No. reqs
AT&T (7018) | 10% | 4% | 876,741
UUNET (6172) | 96% | 1% | 614,341
BBN (1) | 63% | 48% | 225,368
Sprint (1239) | 70% | 37% | 153,225
IBM (2688) | 3% | 0.5% | 145,158
UCB (25) | 98% | 34% | 38,196
MIT (3) | 99% | | 6,341
Cornell (26) | 99% | 46% | 2,341
CMU (9) | 99% | 94% | 4,090
UTAustin (18) | 98% | 70% | 12,878

Impact on commercial CDNs
- Impact on server selection accuracy
- Look for clients
  - Whose LDNS responds to queries
  - With a cache server in the client’s cluster
- Are they directed to a cache server in a different cluster? If so, they are “misdirected”
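The selection criteria above amount to a small decision procedure, sketched here in Python. The function, its parameters, and the label strings are hypothetical; only the three conditions come from the slide.

```python
def classify_client(client_cluster, ldns_responds, cdn_server_clusters,
                    returned_server_cluster):
    """Classify one client for the server-selection accuracy study.

    client_cluster: cluster (AS or network cluster) of the client.
    ldns_responds: True if the client's LDNS answers the probe queries.
    cdn_server_clusters: set of clusters in which the CDN has a cache server.
    returned_server_cluster: cluster of the server returned via the LDNS.
    """
    if client_cluster not in cdn_server_clusters:
        return "not considered"   # the CDN has no server near this client
    if not ldns_responds:
        return "unverifiable"     # the CDN's answer cannot be observed
    if returned_server_cluster != client_cluster:
        return "misdirected"      # a closer server existed but was not chosen
    return "well directed"
```

In the tables that follow, "verifiable clients" corresponds to the clients that get past the first two checks.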

Impact on commercial CDNs: AS clustering
CDN | CDN X | CDN Y | CDN Z
Clients with CDN server in cluster | 1,679,515 | 1,215,372 | 618,897
Verifiable clients | 1,324,022 | 961,382 | 516,969
Misdirected clients (% of verifiable clients) (% of clusters occupied) | 809,683 (60%) (92%) | 752,822 (77%) (94%) | 434,905 (82%) (94%)
Clients with LDNS not in client’s cluster (% of misdirected clients) | 443,394 (55%) | 354,928 (47%) | 262,713 (60%)

Impact on commercial CDNs: network clustering
CDN | CDN X | CDN Y | CDN Z
Clients with cache server in cluster | 264,743 | 156,507 | 103,448
Verifiable clients | 221,440 | 132,567 | 90,264
Misdirected clients (% of verifiable clients) (% of clusters occupied) | 154,198 (68%) (77%) | 125,449 (94%) (82%) | 87,486 (96%) (93%)
Clients with LDNS not in client’s cluster (% of misdirected clients) | 145,276 (94%) | 116,073 (93%) | 84,737 (97%)

Why choose a cache in a different cluster, even when both client and LDNS are in the same cluster?
Possible reasons:
- Load-balancing algorithms using different metrics, e.g., network access costs
- Caches are different
- Clustering too coarse-grained
- CDN mapping inaccuracies?

Lessons from study of commercial CDNs
- AS hop count is a bad metric for closeness evaluation: it is too coarse-grained
  - It may be better to choose a geographically closer cache server in a different AS
- For load-balancing and fault-tolerance, CDNs sometimes return cache servers in two different ASes

Related work
- Measurement methodology
  1. IBM (Shaikh et al.): time correlation of DNS and HTTP requests from DNS and Web server logs
  2. Boston University (Bestavros et al.): assigning multiple IP addresses to a Web server
  3. Web bugs
  - Differences from our work: our methodology is efficient, accurate, and nonintrusive
- Proximity metrics
  - Cisco’s Boomerang protocol: uses latency from cache servers to the LDNS

Conclusion
- Novel technique for finding client and local DNS associations
  - Fast, non-intrusive, and accurate
- DNS-based server selection works well for coarse-grained load-balancing
  - 64% of associations in the same AS
  - 16% of associations in the same NAC
- Server selection can be inaccurate if server density is high