Chapter 9: Applications (Naming)
Outline
- Terminology
- Domain Naming System
- Distributed File Systems
Overview
What do names do?
- identify objects
- help locate objects
- define membership in a group
- specify a role
- convey knowledge of a secret
Name space
- defines the set of possible names
- consists of a set of name-to-value bindings
Properties
- Names versus addresses
- Location-transparent versus location-dependent
- Flat versus hierarchical
- Global versus local
- Absolute versus relative
- By architecture versus by convention
- Unique versus ambiguous
Examples
Hosts
- cheltenham.cs.princeton.edu
- 192.12.69.17
- :23:A8:33:5B:9F
Files
- /usr/llp/tmp/foo
- (server, fileid)
Users
- Larry Peterson
Examples (cont)
Mailboxes
Services
- a nearby PostScript printer with a short queue and 2 MB of memory
Domain Naming System
Hierarchy: e.g., the name chinstrap.cs.princeton.edu
[Figure: the DNS naming hierarchy: top-level domains (edu, com, gov, mil, org, net, uk, fr); second-level domains such as princeton, mit, cisco, yahoo, nasa, nsf, arpa, navy, acm, ieee; princeton's subdomains cs, ee, physics; and hosts ux01, ux04 under cs]
Name Servers
- Partition the hierarchy into zones
- Each zone is implemented by two or more name servers
[Figure: the hierarchy partitioned into zones: a root name server delegates to the Princeton and Cisco name servers, and the Princeton zone delegates in turn to the CS and EE name servers]
Resource Records
Each name server maintains a collection of resource records:
  (Name, Value, Type, Class, TTL)
- Name/Value: not necessarily host names to IP addresses
- Type
  - A: Value gives an IP address for the named host (used in the examples below)
  - NS: Value gives the domain name of a host running a name server that knows how to resolve names within the specified domain
  - CNAME: Value gives the canonical name for a particular host; used to define aliases
  - MX: Value gives the domain name of a host running a mail server that accepts messages for the specified domain
- Class: allows other entities to define record types
- TTL: how long the resource record is valid
Root Server
(princeton.edu, cit.princeton.edu, NS, IN)
(cit.princeton.edu, , A, IN)
(cisco.com, thumper.cisco.com, NS, IN)
(thumper.cisco.com, , A, IN)
…
Princeton Server
(cs.princeton.edu, optima.cs.princeton.edu, NS, IN)
(optima.cs.princeton.edu, , A, IN)
(ee.princeton.edu, helios.ee.princeton.edu, NS, IN)
(helios.ee.princeton.edu, , A, IN)
(jupiter.physics.princeton.edu, , A, IN)
(saturn.physics.princeton.edu, , A, IN)
(mars.physics.princeton.edu, , A, IN)
(venus.physics.princeton.edu, , A, IN)
CS Server
(cs.princeton.edu, optima.cs.princeton.edu, MX, IN)
(cheltenham.cs.princeton.edu, , A, IN)
(che.cs.princeton.edu, cheltenham.cs.princeton.edu, CNAME, IN)
(optima.cs.princeton.edu, , A, IN)
(opt.cs.princeton.edu, optima.cs.princeton.edu, CNAME, IN)
(baskerville.cs.princeton.edu, , A, IN)
(bas.cs.princeton.edu, baskerville.cs.princeton.edu, CNAME, IN)
Name Resolution
Strategies
- forward
- iterative
- recursive
Local server
- need to know the root at only one place (not at each host)
- site-wide cache
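A minimal sketch of iterative resolution over in-memory tables shaped like the (Name, Value, Type, Class) records on the next few slides. The zone contents and the 10.0.0.x addresses are made-up placeholders for illustration, not values from the slides.

```python
# Hypothetical zone tables: server name -> list of (Name, Value, Type, Class).
ZONES = {
    "root": [
        ("princeton.edu", "cit.princeton.edu", "NS", "IN"),
        ("cit.princeton.edu", "10.0.0.1", "A", "IN"),            # placeholder address
    ],
    "cit.princeton.edu": [
        ("cs.princeton.edu", "optima.cs.princeton.edu", "NS", "IN"),
        ("optima.cs.princeton.edu", "10.0.0.2", "A", "IN"),       # placeholder address
    ],
    "optima.cs.princeton.edu": [
        ("cheltenham.cs.princeton.edu", "10.0.0.3", "A", "IN"),   # placeholder address
    ],
}

def resolve(name, server="root"):
    """Iteratively follow NS records until an A record for `name` is found."""
    while True:
        records = ZONES[server]
        # Direct answer: an A record for the queried name.
        for rname, value, rtype, _ in records:
            if rtype == "A" and rname == name:
                return value
        # Otherwise find an NS record whose zone is a suffix of the name
        # and continue the query at that name server.
        next_server = None
        for rname, value, rtype, _ in records:
            if rtype == "NS" and name.endswith(rname):
                next_server = value
        if next_server is None:
            raise KeyError(f"cannot resolve {name}")
        server = next_server

print(resolve("cheltenham.cs.princeton.edu"))   # -> 10.0.0.3
```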
Distributed File Systems
No Transparency (global names)
- AFS: /cs.princeton.edu/usr/llp/tmp/foo
- Windows: f:/usr/llp/tmp/foo
Transparency by Convention
- NFS: /usr/llp/tmp/foo
- or not: /n/fs/fac5/llp/tmp/foo
Transparency by Architecture
- Sprite: /usr/llp/tmp/foo
Private versus Shared
- AFS: /usr/llp/tmp/foo versus /afs/shared
Example
[Figure: a directory tree with nodes a-s partitioned among file servers]
Prefix table (prefix → domain):
- /      → 1
- /a/    → 2
- /d/    →
- /d/k/  → 4
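A minimal sketch, under assumed details, of how a prefix table like the one above is used: the longest matching prefix of a path selects the serving domain. The /d/ row is omitted because its domain number is not readable above; the helper names are illustrative.

```python
# Hypothetical prefix table: path prefix -> domain (server) number.
PREFIX_TABLE = {
    "/": 1,
    "/a/": 2,
    "/d/k/": 4,
}

def lookup(path):
    """Return (domain, remainder) for the longest prefix that matches `path`."""
    best = max((p for p in PREFIX_TABLE if path.startswith(p)), key=len)
    return PREFIX_TABLE[best], path[len(best):]

print(lookup("/d/k/r"))   # -> (4, 'r')   handled by domain 4
print(lookup("/a/e"))     # -> (2, 'e')   handled by domain 2
print(lookup("/c"))       # -> (1, 'c')   falls through to the root domain
```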
Stupid Naming Tricks
- Symbolic links and mount points
- Per-user and logical name spaces
- Computed directories
- Load balancing and content distribution
- Attribute-based names
- Hash-based schemes
Peer-to-Peer Networks
Outline
- Survey
- Self-organizing overlay network
- File system on top of a P2P network
Background
- Distribution
- Decentralized control
- Self-organization
- Symmetric communication
Examples
Pioneers
- Napster, Gnutella, FreeNet
Academic prototypes
- Pastry, Chord, CAN, …
Common Issues
- Organize and maintain the overlay network
  - node arrivals
  - node failures
- Resource allocation / load balancing
- Resource location
- Locality (network proximity)
Idea: a generic P2P substrate
Architecture
[Figure: layered architecture: P2P applications (network storage, event notification, ?) on top of a self-organizing overlay network (the P2P substrate), which runs over TCP/IP on the Internet]
Object Distribution
Consistent hashing [Karger et al. '97]
- 128-bit circular id space (0 to 2^128 - 1)
- nodeIds (uniform random)
- objIds (uniform random)
- Invariant: the node with the numerically closest nodeId maintains the object
Notes: Each node has a randomly assigned 128-bit nodeId in a circular namespace. Basic operation: a message with key X, sent by any Pastry node, is delivered to the live node with nodeId closest to X in at most log16 N steps (barring node failures). Pastry uses a form of generalized hypercube routing, where the routing tables are initialized and updated dynamically.
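A minimal sketch, with assumed details rather than Pastry's actual code, of the invariant above: on a circular 128-bit id space, an object is maintained by the live node whose nodeId is numerically closest to its objId.

```python
import hashlib
import random

ID_BITS = 128
ID_SPACE = 1 << ID_BITS

def circular_distance(a, b):
    """Distance between two ids on the 2^128 circle."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)

def responsible_node(objid, node_ids):
    """The nodeId numerically closest to objid on the circle."""
    return min(node_ids, key=lambda n: circular_distance(n, objid))

random.seed(0)
nodes = [random.getrandbits(ID_BITS) for _ in range(16)]   # uniform random nodeIds
objid = int.from_bytes(hashlib.sha1(b"some object").digest(), "big") % ID_SPACE
print(hex(responsible_node(objid, nodes)))
```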
Object Insertion/Lookup
- A message with key X is routed to the live node with nodeId closest to X
- Problem: a complete routing table is not feasible
[Figure: Route(X): a message from the originating node O travels around the id circle to the node closest to X]
Routing
Properties
- log16 N steps
- O(log N) state
Leaf Sets
Each node maintains the IP addresses of the nodes with the L numerically closest larger and smaller nodeIds, respectively.
Used for:
- routing efficiency/robustness
- fault detection (keep-alive)
- application-specific local coordination
Routing Procedure
if (destination is within range of our leaf set)
    forward to the numerically closest member
else
    let l = length of the shared prefix
    let d = value of the l-th digit in D's address
    if (the routing table entry R[l][d] exists)
        forward to R[l][d]
    else
        forward to a known node that
        (a) shares at least as long a prefix, and
        (b) is numerically closer than this node
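A hedged Python sketch of one routing step. The data structures (a list of leaf-set nodeIds, a routing table keyed by (row, digit)), the digit size b = 4 (hex digits), and the simplified, non-wrapping leaf-set range check are assumptions made for illustration, not Pastry's implementation.

```python
B = 4                      # digits are 4 bits wide, i.e. hexadecimal
DIGITS = 128 // B

def digits_of(node_id):
    """128-bit id as a fixed-width hex string of DIGITS digits."""
    return f"{node_id:0{DIGITS}x}"

def shared_prefix_len(a, b):
    da, db = digits_of(a), digits_of(b)
    n = 0
    while n < DIGITS and da[n] == db[n]:
        n += 1
    return n

def route_step(self_id, dest, leaf_set, routing_table, known_nodes):
    """Return the next hop for key `dest`, or None to deliver at this node."""
    if self_id == dest:
        return None
    # 1. Destination within the leaf-set range: numerically closest member wins.
    if leaf_set and min(leaf_set) <= dest <= max(leaf_set):
        closest = min(leaf_set + [self_id], key=lambda n: abs(n - dest))
        return None if closest == self_id else closest
    # 2. Routing table entry for row l (shared prefix length), column d (next digit).
    l = shared_prefix_len(self_id, dest)
    d = digits_of(dest)[l]
    entry = routing_table.get((l, d))
    if entry is not None:
        return entry
    # 3. Fall back to any known node that shares at least as long a prefix
    #    and is numerically closer to dest than this node.
    for n in known_nodes:
        if shared_prefix_len(n, dest) >= l and abs(n - dest) < abs(self_id - dest):
            return n
    return None
```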
Integrity of the overlay:
- guaranteed unless L/2 nodes with adjacent nodeIds fail simultaneously
Number of routing hops:
- no failures: < log16 N expected, 128/b + 1 maximum
- during failure recovery: O(N) worst case, average case much better
Node Addition
[Figure: a new node joining the Pastry ring]
Node Departure (Failure)
- Leaf-set members exchange keep-alive messages
- Leaf-set repair (eager): request the set from the farthest live node in the set
- Routing table repair (lazy): get entries from peers in the same row, then from higher rows
API
- route(M, X): route message M to the node with nodeId numerically closest to X
- deliver(M): deliver message M to the application
- forwarding(M, X): message M is being forwarded towards key X
- newLeaf(L): report a change in leaf set L to the application
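A small sketch of this API as Python abstract classes: route() is the downcall an application makes into the substrate, while the other three are upcalls the substrate makes into the application layered above it. Class names and method signatures are assumptions; only the operation names come from the slide.

```python
from abc import ABC, abstractmethod

class PastrySubstrate(ABC):
    """Downcall interface offered by the P2P substrate."""
    @abstractmethod
    def route(self, msg, key):
        """Route msg to the live node with nodeId numerically closest to key."""

class PastryApplication(ABC):
    """Upcalls the substrate makes into the application on top of it."""
    @abstractmethod
    def deliver(self, msg):
        """Called at the node whose nodeId is closest to the message's key."""
    @abstractmethod
    def forwarding(self, msg, key):
        """Called at each intermediate node as msg is forwarded towards key."""
    @abstractmethod
    def new_leaf(self, leaf_set):
        """Called when this node's leaf set changes (newLeaf on the slide)."""
```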
PAST: Cooperative, archival file storage and distribution
Layered on top of Pastry
- Strong persistence
- High availability
- Scalability
- Reduced cost (no backup)
- Efficient use of pooled resources
PAST API
- Insert: store replicas of a file at k diverse storage nodes
- Lookup: retrieve the file from a nearby live storage node that holds a copy
- Reclaim: free the storage associated with a file
Files are immutable.
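A hedged sketch of how these three operations could sit on top of the substrate's route() call, with the fileId used as the routing key. The fileId derivation, message format, and class names are illustrative assumptions, not PAST's actual interface.

```python
import hashlib

def file_id(name: bytes, owner: bytes, salt: bytes = b"") -> int:
    """Illustrative fileId: a hash of name, owner, and salt."""
    return int.from_bytes(hashlib.sha1(name + owner + salt).digest(), "big")

class Past:
    def __init__(self, substrate):
        self.substrate = substrate          # object exposing route(msg, key)

    def insert(self, name, owner, data, k):
        fid = file_id(name, owner)
        # Routed to the node closest to fid, which replicates the file on the
        # k-1 next-closest nodes (see the storage invariant on the next slides).
        self.substrate.route({"op": "insert", "fileId": fid, "k": k, "data": data}, fid)
        return fid

    def lookup(self, fid):
        self.substrate.route({"op": "lookup", "fileId": fid}, fid)

    def reclaim(self, fid, owner):
        self.substrate.route({"op": "reclaim", "fileId": fid, "owner": owner}, fid)
```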
PAST: File Storage
[Figure: an Insert(fileId) request routed through the overlay to the node with nodeId closest to fileId]
PAST file storage is mapped onto the Pastry overlay network by maintaining the invariant that replicas of a file are stored on the k nodes that are numerically closest to the file's numeric fileId. During an insert operation, an insert request for the file is routed using the fileId as the key. The node closest to fileId replicates the file on the k-1 next-nearest nodes in the namespace.
PAST: File Storage (cont)
[Figure: the same Insert(fileId), with k = 4 replicas placed on the four nodes closest to fileId]
Storage invariant: file "replicas" are stored on the k nodes with nodeIds closest to fileId (k is bounded by the leaf-set size).
PAST: File Retrieval
[Figure: a client C issues Lookup(fileId); the request reaches one of the k replicas]
- file located in log16 N steps (expected)
- usually locates the replica nearest the client C
Notes: A lookup request is routed in at most log16 N steps to a node that stores a replica, if one exists. In practice, whichever of the k replica-holding nodes first receives the message serves the file. Furthermore, the network locality properties of Pastry (not discussed in this talk) ensure that this node is usually the one closest to the client in the network.
Content Distribution Networks
Outline
- Implementation techniques
- Hashing schemes
- Redirection strategies
Design Space
Caching
- explicit
- transparent (hijacking connections)
Replication
- server farms
- geographically dispersed (CDN)
Story for CDNs
Traditional: performance
- move content closer to the clients
- avoid server bottlenecks
New: DDoS protection
- dissipate the attack over massive resources
- multiplicatively raise the level of resources needed to attack
Denial of Service Attacks (DoS)
[Figure: an attacker and several clients all sending requests to a single server]
In the classical client/server model, the attacker tries to break the balance: flood the servers or routers, or exploit bugs in the OS. The consequence is that resources are wasted dealing with attack traffic instead of doing useful work on normal clients' requests. Typical defenses: put filters in the router, install a firewall to stop attacks, or try to identify the culprits.
Distributed DoS (DDoS)
[Figure: an attacker controlling slave/zombie hosts that attack the server alongside legitimate clients]
Attackers become smarter and stay behind the scenes: they compromise client hosts and turn them into zombies, then at a chosen time instruct them to attack from distributed locations with more resources. The result is the same; here we concentrate on the server side.
Redirection Overlay
[Figure: geographically distributed server clusters and request-redirectors (R) spread across the Internet backbone, between the clients and the servers]
- Deploy geographically distributed server clusters
- Distributed request-redirectors serve as logical front-ends
Techniques
DNS
- one name maps onto many addresses
- works for both servers and reverse proxies
HTTP
- requires an extra round trip
Router
- one address, select a server (reverse proxy)
- content-based routing (near the client)
URL rewriting
- embedded links
Redirection: Which Replica?
- Balance load
- Cache locality
- Network delay
Hashing Schemes: Modulo
- Easy to compute
- Evenly distributed
- Good for a fixed number of servers
- Many mapping changes after a single server change
[Figure: hash(URL key) % N selects one of svr0 … svrN]
Notes: Hashing schemes differ in computation time and in the amount of reassignment they cause. The classic modulo approach is not suitable for this environment: when the server set changes, only a diminishing fraction of the documents keeps the same server assignment, so adding a server causes undesirable, massive reassignment.
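A minimal sketch of modulo hashing and of the reassignment problem described above; the URLs and server counts are made up for illustration.

```python
import hashlib

def mod_hash(url, n_servers):
    """Classic approach: hash the URL and take it modulo the number of servers."""
    h = int.from_bytes(hashlib.md5(url.encode()).digest(), "big")
    return h % n_servers

urls = [f"/doc/{i}.html" for i in range(1000)]
before = {u: mod_hash(u, 8) for u in urls}
after = {u: mod_hash(u, 9) for u in urls}       # a single server is added
moved = sum(before[u] != after[u] for u in urls)
print(f"{moved / len(urls):.0%} of URLs were reassigned")   # typically around 8/9
```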
Consistent Hashing (CHash)
- Hash the servers, then the URL; assign the URL to the closest match on the unit circle
- Only local mapping changes after adding or removing servers
- Used by state-of-the-art CDNs
[Figure: a unit circle with server points svr0 … svrN and URL points url-0, url-1 mapped to their closest servers]
Notes: If the lookup does not stop at the closest server, consistent hashing also provides an order in which to access the set of servers.
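A minimal consistent-hashing sketch, under assumed details: servers and URLs hash onto a ring (a 32-bit stand-in for the unit circle) and a URL is assigned to the next server point clockwise, one common reading of "closest match". Real deployments typically add virtual nodes per server for balance, which is omitted here.

```python
import bisect
import hashlib

def point(s):
    """Map a string to a point on a 32-bit ring."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:4], "big")

class ConsistentHash:
    def __init__(self, servers):
        self.ring = sorted((point(s), s) for s in servers)
        self.keys = [p for p, _ in self.ring]

    def server_for(self, url):
        """First server point clockwise from the URL's point (wrapping around)."""
        i = bisect.bisect(self.keys, point(url)) % len(self.ring)
        return self.ring[i][1]

ch = ConsistentHash([f"svr{i}" for i in range(8)])
print(ch.server_for("/doc/42.html"))
```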
Highest Random Weight (HRW)
- Hash(url, svrAddr) gives each server a weight for that URL; sort the weights
- Deterministic order in which to access the set of servers
- Different order for different URLs
- Load evenly distributed after server changes
[Figure: the URL is hashed with each server address, producing weight0 … weightN, which are sorted from high to low]
Notes: HRW is the basis for the Cache Array Routing Protocol: hashing the URL with each server and sorting the results gives an order for accessing the set of servers; the original target is at the top, and more than one server can be used by following the order. The drawback is computation time, O(N log N) to generate the list, which can be alleviated by caching the list (or only its top few entries if space is a concern). As with consistent hashing, reassignment is not a problem, and HRW gives a different order for different URLs.
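A minimal HRW (rendezvous hashing) sketch: the URL is hashed together with every server address and the servers are sorted by the resulting weights; the top entry is the primary server and the rest give the per-URL fallback order. The hash choice and names are assumptions.

```python
import hashlib

def weight(url, server):
    """Pseudo-random weight of a (url, server) pair."""
    return int.from_bytes(hashlib.md5(f"{url}|{server}".encode()).digest(), "big")

def hrw_order(url, servers):
    """Servers sorted by descending weight: a deterministic, per-URL access order."""
    return sorted(servers, key=lambda s: weight(url, s), reverse=True)

servers = [f"svr{i}" for i in range(8)]
print(hrw_order("/doc/42.html", servers)[:3])   # top three candidates for this URL
```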
Redirection Strategies
Random (Rand)
- Requests randomly sent to cooperating servers
- Baseline case, no pathological behavior
Replicated Consistent Hashing (R-CHash)
- Each URL hashed to a fixed number of server replicas
- For each request, randomly select one replica
Replicated Highest Random Weight (R-HRW)
- Similar to R-CHash, but uses HRW hashing
- Less likely that two URLs have the same set of replicas
Notes: Of the five strategies, three are new; consistent hashing and highest random weight are used as components. Random scales with the number of servers since it imposes no access pattern, but it is at a disadvantage with respect to URL locality: performance improves when a higher fraction of requests is served from main memory, and under Random every server sees the full, growing working set. In R-CHash, the URL is hashed to a point on the unit circle and its replicas are evenly spaced starting from that point; the number of replicas is fixed but configurable, and the optimal number is chosen empirically in later experiments. R-HRW is the counterpart of R-CHash: the HRW-ordered server list is computed and the top K servers are the possible targets for the URL. Because two URLs are less likely to get the same set of replicas, less popular URLs that overlap with popular ones are also likely to have some less-loaded replicas.
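A hedged sketch of R-CHash replica selection following the description above: the URL hashes to a point on the ring, K replica points are spaced evenly around the ring starting from it, each replica point maps to its server via consistent hashing, and one replica is picked at random per request. K, the ring construction, and the helper names are assumptions.

```python
import bisect
import hashlib
import random

RING = 1 << 32

def point(s):
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:4], "big")

def r_chash_replicas(url, servers, k=4):
    ring = sorted((point(s), s) for s in servers)
    keys = [p for p, _ in ring]
    base = point(url)
    replicas = []
    for i in range(k):
        p = (base + i * RING // k) % RING           # k evenly spaced points
        s = ring[bisect.bisect(keys, p) % len(ring)][1]
        if s not in replicas:                       # servers may repeat if N is small
            replicas.append(s)
    return replicas

def r_chash_pick(url, servers, k=4):
    return random.choice(r_chash_replicas(url, servers, k))

print(r_chash_pick("/doc/42.html", [f"svr{i}" for i in range(8)]))
```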
Redirection Strategies (cont)
Coarse Dynamic Replication (CDR)
- Use HRW hashing to generate the ordered server list
- Walk through the server list to find a lightly loaded server
- Number of replicas for each URL is dynamically adjusted
- Uses coarse-grained server load information
Fine Dynamic Replication (FDR)
- Bookkeeping of the minimum number of replicas per URL (popularity)
- Let more popular URLs use more replicas
- Keep less popular URLs from extra replication
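A hedged sketch of the CDR idea: walk the URL's HRW-ordered server list and use the first server whose coarse-grained load is below a threshold, so the number of servers a URL effectively uses grows with demand. The load values, the threshold, and the helper names are illustrative assumptions.

```python
import hashlib

def hrw_order(url, servers):
    """HRW order, as in the earlier sketch: sort servers by hash(url, server)."""
    w = lambda s: int.from_bytes(hashlib.md5(f"{url}|{s}".encode()).digest(), "big")
    return sorted(servers, key=w, reverse=True)

def cdr_pick(url, servers, load, threshold):
    candidates = hrw_order(url, servers)
    for s in candidates:
        if load[s] < threshold:             # first lightly loaded server in HRW order
            return s
    return min(candidates, key=lambda s: load[s])   # all busy: least-loaded fallback

load = {f"svr{i}": i * 10 for i in range(8)}         # made-up coarse load values
print(cdr_pick("/doc/42.html", list(load), load, threshold=35))
```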
Simulation
Identifying bottlenecks
- server overload, network congestion, …
End-to-end network simulator prototype
- models the network, application, and OS
- built on the NS + LARD simulators
- 100s of servers, 1000s of clients
- >60,000 req/s using full-TCP transport
Measure capacity, latency, and scalability
Network Topology
[Figure: the simulated US-wide topology of servers (S), clients (C), and routers (R) spanning WA, CA, SD, CO, NE, TX, IL, MI, GA, DC, PA, and MA]
Simulation Setup
Workload
- static documents from a web-server trace, available at each cooperating server
- attackers at random locations repeatedly request a subset of random files
Simulation process
- gradually increase the offered request load
- end when the servers are very heavily overloaded
Capacity: 64-Server Case, Normal Operation
[Figure: capacity of the redirection strategies (including FDR-Ideal) with 64 servers under normal operation]
Notes: A single server can handle ~600 req/s in simulation. Under normal operation, static replication almost doubles capacity compared to Rand, and dynamic replication gains another 61% over static replication (245% over Rand), achieving much better performance with only local knowledge.
Capacity: 64-Server Case, Under Attack (250 zombies, 10 files, avg 6 KB)
[Figure: capacity of the redirection strategies with 64 servers under attack; FDR-Ideal shows the benefit of FDR]
Notes: In the attack scenario, 25% of the 1000 clients are zombies requesting 10 URLs averaging 6 KB. The improvement of dynamic over static replication rises to 76%-94%, from 61% under normal operation; the absolute numbers are higher while the hit rate stays relatively small.
Latency: 64 Servers Under Attack
[Figure: response latency versus offered load for the redirection strategies]
- Random's max: 11.2k req/s; R-CHash's max: 19.8k req/s
- Median latency: Rand: 1.34, R-CHash: 0.5, R-HRW: 0.5, CDR: 0.5, FDR: 0.5
Latency
- At CDR's max: 35.1k req/s
- Benefit of FDR over CDR: the finer control of FDR
Capacity Scalability
[Figure: capacity versus number of servers, under normal operation and under attack (250 zombies, 10 files)]
- Why non-linear?
Various Attacks (32 servers)
[Figure: capacity under two attack scenarios: 1 victim file of 1 KB, and 10 victim files averaging 6 KB]
Notes: Another scenario is to randomly select a wide range of URLs. If those URLs are valid, the dynamic schemes degenerate to one server per URL, which is desirable behavior (high hit rate). If the URLs are invalid, they are forwarded to the origin server and can bring it down; in the reverse-proxy case this is countered by throttling the number of URL misses forwarded. A further scenario uses a small group of attacking clients.
Deployment Issues
Servers join the DDoS protection overlay
- same story as Akamai
- get protection and performance
Clients use the DDoS protection service
- same story as proxy caching
- incrementally deployable
- get faster responses and help others