15-829A/18-849B/95-811A/19-729A Internet-Scale Sensor Systems: Design and Policy Lecture 12 – Name Lookup
Name Lookup
  What do names/descriptions look like?
  How is the searching done?
  What type of searches?
    Search for a particular service, browse available services, or find a collection of services (composition)?
    Exact-match queries or richer queries?
    Find any one matching instance or find all matching instances?
    Which instance to choose, and what are the right metrics?
Relevance
  Finding sensor data
    Alternatives to the XML hierarchical query processing in IrisNet
  Finding useful nodes
    Sensors
    Nodes for query processing
    Services’ web interface
Naming and service discovery
  Wide-area naming
    DNS, Global Name Service, Grapevine
  Attribute-based systems
    X.500, Information Bus, Discover query routing
  Service location
    IETF SLP, Berkeley service discovery service
  Device discovery
    Jini, Universal Plug and Play
  Intentional Naming System (INS)
  Geographic Hash Tables
Outline
  Overview
  INS
  GHT
Name Types
  Flat names (single attribute-value pair)
    Basically a single string that identifies each entity
    E.g. Frank, Joe, Fred, etc.
    Good match with lookup techniques like DHTs
  Hierarchical names (single hierarchy of attribute-value pairs)
    Entities classified by category, subcategory, etc.
    E.g. Frank.Smith.American
    Good match with lookup techniques like DNS
Name Types
  Multiple hierarchical attribute-value pairs
    Type = printer -> memory = 32MB, lang = PCL
    Location = CMU -> building = WeH
    Hierarchy based on attributes or attribute-values?
      E.g. country -> state, or country=USA -> state=PA and country=Canada -> province=BC?
    Can be done in something like XML
    Good match with lookup techniques like INS
  Interface description
    Jini
Query Types
  Exact match
    name = fred & age = 32
  Ranges
    name > fred & name < george
  Simple wildcards
    name = * & age = 32
  Regular expression
    name = fr*d
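The four query types can be illustrated over a small in-memory set of attribute-value records. This is only a sketch: the records, function names, and strict range bounds are assumptions made for illustration. Note that `fr*d` on the slide reads most naturally as a glob; as a strict regular expression the equivalent pattern would be `fr.*d`, which is what the example uses.

```python
import re

# Hypothetical records; each is a flat set of attribute-value pairs.
records = [
    {"name": "fred", "age": 32},
    {"name": "frank", "age": 27},
    {"name": "gary", "age": 32},
    {"name": "george", "age": 41},
]

def exact(rs, **want):
    """Exact match: every requested attribute must equal the given value."""
    return [r for r in rs if all(r.get(k) == v for k, v in want.items())]

def in_range(rs, attr, low, high):
    """Range query with strict bounds: low < r[attr] < high."""
    return [r for r in rs if low < r[attr] < high]

def wildcard(rs, **want):
    """Simple wildcard: '*' accepts any value, but the attribute must exist."""
    return [r for r in rs
            if all(k in r and (v == "*" or r[k] == v) for k, v in want.items())]

def regex(rs, attr, pattern):
    """Regular expression: the whole attribute value must match the pattern."""
    return [r for r in rs if re.fullmatch(pattern, str(r[attr]))]

if __name__ == "__main__":
    print(exact(records, name="fred", age=32))
    print(in_range(records, "name", "fred", "george"))
    print(wildcard(records, name="*", age=32))
    print(regex(records, "name", r"fr.*d"))
```

Exact match maps directly onto a hash lookup, which is why it suits DHTs; the other three need to scan or to exploit ordering, which is part of why richer queries are harder to route.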
Lookup Mechanisms (Multicast/Broadcast/Flooding)
  Services listen on a well-known discovery group address
  Client multicasts query to discovery group
  Services unicast replies to client
  IP multicast
    Like a radio system
    Receivers subscribe to multicast groups (tune)
    Senders transmit with a particular TTL to the group (transmission power to a particular frequency)
  Multicast not widely deployed due to scalability (and other) problems
Lookup Mechanisms (Multicast/Broadcast/Flooding)
  Used by many systems: SLP, Jini, UPnP
  Tradeoffs
    Not very scalable: effectively a broadcast search
    Requires no dedicated infrastructure or bootstrap
    Easily adapts to availability/changes
    Can scope request by multicast scoping and by information in request
Lookup Mechanisms (Directory Based/Centralized)
  Services register with central directory agent
    Soft state: registrations must be refreshed or they expire
  Clients send query to central directory, which replies with list of matches
  Used by many systems: SLP, Jini, UPnP
  Typically one central directory per domain
    How do directories interact? Often they don’t
Lookup Mechanisms (Directory Based/Centralized)
  Tradeoffs
    How do you find the central directory service?
      Typically using multicast-based discovery!
      SLP also allows directory to do periodic advertisements
    Need dedicated infrastructure
    Well suited for browsing and composition: knows full list of services
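The soft-state registration idea behind directory-based lookup can be sketched as a table whose entries carry an expiry time. The class name, the 30-second TTL, and the explicit `now` argument (used instead of a real clock so the behavior is deterministic) are all assumptions for illustration, not part of SLP, Jini, or UPnP.

```python
class Directory:
    """Toy central directory with soft-state registrations."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self._entries = {}  # service name -> (description dict, expiry time)

    def register(self, name, description, now):
        # Registering again simply pushes the expiry time forward (a refresh).
        self._entries[name] = (description, now + self.ttl)

    def lookup(self, now, **want):
        # Drop registrations that were not refreshed in time.
        self._entries = {n: e for n, e in self._entries.items() if e[1] > now}
        # Return the names of all live services matching every attribute.
        return sorted(
            n for n, (desc, _) in self._entries.items()
            if all(desc.get(k) == v for k, v in want.items())
        )

if __name__ == "__main__":
    d = Directory(ttl=30.0)
    d.register("printer-1", {"type": "printer"}, now=0.0)
    d.register("camera-1", {"type": "camera"}, now=0.0)
    d.register("printer-1", {"type": "printer"}, now=20.0)  # refresh
    print(d.lookup(now=35.0))  # camera-1 was never refreshed
```

Because a crashed service just stops refreshing, no explicit de-registration is needed. Calling `lookup` with no attributes returns the full list of live services, which is what makes a directory convenient for browsing.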
Lookup Mechanisms (Routing Based)
  Client issues query to overlay network
    Query can include both service description and actual request for service
  Overlay network routes query to desired service[s]
  How is the overlay structured/created?
    DNS: administrative hierarchy
    DHT: structured (circle, multi-dimensional torus, Plaxton, etc.)
    INS: self-organized network, latency based
  Tradeoffs
    Routing on complex parameters can be difficult/expensive
    Can work especially well in ad-hoc networks
    Can late binding really be used in many applications?
Other Issues
  Security
    Don’t want others to serve/change queries
    Also, don’t want others to know about existence of services
      Srini’s home SLP server is advertising the $50,000 MP3 stereo system (come steal me!)
    Addressed in directory-based systems through the use of capabilities: certificates that grant access to particular service discovery records
Outline
  Overview
  INS
  GHT
Applications
  Location-dependent mobile applications
    Floorplan: a map-based navigation tool
    Camera: a mobile image/video service
    Load-balancing printer
    TV & jukebox service
  Sensor computing
  Network-independent “instant messaging”
Environment
  Heterogeneous network with devices, sensors and computers
  Dynamism
    Mobility
    Performance variability
    Services “come and go”
    Services may be composed of groups of nodes
  Example applications
    Location-dependent mobile apps
    Network of mobile cameras
  Problem: resource discovery
Design goals and principles
  Expressiveness
    Names are intentional; apps know what, not where
  Responsiveness
    Integrate name resolution and message routing (late binding)
  Robustness
    Decentralized, cooperating resolvers with soft-state protocol
  Easy configuration
    Name resolvers self-configure into overlay network
Name-specifiers
  Expressive name language (like XML)
  Resolver architecture decoupled from language
  Providers announce descriptive names
  Clients make queries
    Attribute-value matches
    Wildcard matches
    Ranges
  Example name-specifiers:
    [vspace = mit.edu/thermometer]
    [building = ne43
      [floor = 5
        [room = *]]]
    [temperature < 60°F]
    data

    [vspace = lcs.mit.edu/camera]
    [building = ne43
      [room = 510]]
    [resolution = 800x600]
    [access = public]
    [status = ready]
Name Lookups
  Lookup
    Tree-matching algorithm
    AND operations among orthogonal attributes
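A rough feel for tree matching with AND semantics can be had from a recursive comparison of two nested name-specifiers. This is a simplified pairwise match written for illustration, not the INS lookup algorithm itself (which walks a shared name-tree at the resolver). The nested-dict encoding, where each attribute maps to a `(value, children)` tuple, is an assumption, and range operators are left out.

```python
def matches(query, advert):
    """Return True if every attribute in the query is satisfied by the advert.

    Both arguments map attribute -> (value, children), where children is a
    dict of the same shape. Sibling attributes are ANDed together, and '*'
    in the query accepts any advertised value.
    """
    for attr, (q_val, q_children) in query.items():
        if attr not in advert:
            return False
        a_val, a_children = advert[attr]
        if q_val != "*" and q_val != a_val:
            return False
        if not matches(q_children, a_children):
            return False
    return True

# Camera announcement from the name-specifier example, in the assumed encoding.
camera = {
    "vspace": ("lcs.mit.edu/camera", {}),
    "building": ("ne43", {"room": ("510", {})}),
    "resolution": ("800x600", {}),
    "access": ("public", {}),
    "status": ("ready", {}),
}

if __name__ == "__main__":
    any_room = {"building": ("ne43", {"room": ("*", {})}), "access": ("public", {})}
    other_room = {"building": ("ne43", {"room": ("504", {})})}
    print(matches(any_room, camera), matches(other_room, camera))
```

Attributes that the query omits are treated as "don't care", so a short query can match a much more detailed announcement.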
Resolver Network
  Resolvers exchange routing information about names
  Uses triggered updates to rapidly adapt to changes
  Decentralized construction and maintenance
  Implemented as an “overlay” network over UDP tunnels
  Not every node needs to be a resolver
  Too many neighbors causes overload, but need a connected graph
  Overlay link metric should reflect performance
INS Architecture: Message routing using intentional names
  [Figure labels: Name resolver; Overlay network of resolvers; Client; Name; Service]
Robustness
  Decentralized name resolution and routing in “serverless” fashion
  Names are weakly consistent, like network-layer routes
    Routing protocol with periodic & triggered updates to exchange names
  Routing state is soft
    Expires if not updated
    Robust against service/client failure
    No need for explicit de-registration
Routing Protocol Scalability
  vspace = set of names with common attributes
  Virtual-space partitioning: each resolver now handles a subset of all vspaces
  [Figure labels: Name-tree at resolver; vspace=camera; vspace=5th-floor; Delegate this to another INR; Routing updates for all names]
Lookups
  Two styles of message delivery
    Anycast
    Multicast
  Two types of lookup
    Early binding
    Late binding
Lookup Types
  If query only contains description, subsequent interactions must be outside overlay (early binding)
    Use IP address for subsequent messages
  If query includes request, client can send subsequent queries via overlay (late binding)
    Subsequent requests may go to different service agents
    Enables easy fail-over/mobility of service
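The difference between the two binding styles shows up when a service moves. In the sketch below, the `Resolver` class, its method names, and the addresses are hypothetical: early binding resolves once and keeps the address, while late binding hands the name and message to the resolver on every send.

```python
class Resolver:
    """Toy resolver mapping an intentional name to its current address."""

    def __init__(self):
        self.names = {}

    def announce(self, name, address):
        # A service (re)announces where it currently lives.
        self.names[name] = address

    def resolve(self, name):
        # Early binding: hand the address back; the client talks to it directly.
        return self.names[name]

    def deliver(self, name, message):
        # Late binding: the name is resolved at delivery time, per message.
        return (self.names[name], message)

if __name__ == "__main__":
    r = Resolver()
    r.announce("camera", "10.0.0.1")
    cached = r.resolve("camera")          # early-bound address
    r.announce("camera", "10.0.0.2")      # the service moves or fails over
    print(cached)                         # stale address held by the client
    print(r.deliver("camera", "snap"))    # late binding follows the move
```

The price of late binding is that every message passes through the overlay, and successive requests may reach different service agents, which is why it works best with stateless requests.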
Intentional Anycast
  lookup(name) yields all matches
  Resolver selects location based on advertised service-controlled metric
    E.g., server load
  Tunnels message to selected node
  Application-level vs. IP-level anycast
    Service-advertised metric is meaningful to the application
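Intentional anycast amounts to "look up all matches, then pick the one with the best advertised metric". A minimal sketch follows, assuming each advertisement is a dict with hypothetical `name`, `address`, and `metric` fields, where a lower metric (for example, server load) is better.

```python
def anycast(name, adverts):
    """Pick the advertisement for `name` with the lowest service-set metric.

    Returns None when no service currently advertises the name.
    """
    candidates = [a for a in adverts if a["name"] == name]
    if not candidates:
        return None
    return min(candidates, key=lambda a: a["metric"])

if __name__ == "__main__":
    adverts = [
        {"name": "printer", "address": "10.0.0.5", "metric": 7},  # busy
        {"name": "printer", "address": "10.0.0.6", "metric": 2},  # lightly loaded
        {"name": "camera", "address": "10.0.0.9", "metric": 1},
    ]
    print(anycast("printer", adverts))
```

Since the service chooses what the metric means, the same selection code can balance print queues, pick the least-loaded server, or prefer the nearest sensor.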
ASIDE: Server Selection
  Service is replicated in many places in network
  How do we direct clients to a particular server?
    As part of routing: anycast, cluster load balancing
    As part of application: HTTP redirect
    As part of naming: DNS
  Which server?
    Lowest load: to balance load on servers
    Best performance: to improve client performance
      Based on geography? RTT? Throughput? Load?
    Any alive node: to provide fault tolerance
ASIDE: Routing Based Server Selection
  Anycast
    Give service a single IP address
    Each node implementing service advertises route to address
    Packets get routed from client to “closest” service node
      Closest is defined by routing metrics
      May not mirror performance/application needs
    What about the stability of routes?
Intentional Multicast
  Use intentional name as group handle
  Each resolver maintains list of neighbors for a name
  Data forwarded along a spanning tree of the overlay network
    Shared tree, rather than per-source trees
  Enables more than just receiver-initiated group communication
INS Architecture: Message routing using intentional names
  [Figure labels: Name resolver; Overlay network of resolvers; Client; Intentional anycast; Intentional multicast; Name; Service; Late binding; Name with message]
Discussion
  Distributed without relying on multicast
  Late binding: how useful is this?
    Nice for fault recovery, but …
    Need stateless messaging and careful application design
  Soft state critical to robustness of such designs
  Application-level metrics for routing
  Handling dynamic attributes
    Difficult to scale with such attributes
  How to scale?
Wide Area Scaling
  How do we scale INS to wide area?
    Hierarchy or DHTs?
  Hierarchy: must be based on an attribute of services
    All services must have this attribute
    All queries must include (implicitly or explicitly) this attribute
  Tradeoffs
    What attribute? Administrative (like DNS)? Geographic? Network topology?
    Should it have multiple hierarchies?
    Can support range queries nicely
Wide Area Scaling
  INS over Chord: Twine
  DHTs: what are the keys?
    Must insert service at all possible lookup combinations
    One entry per attribute-value pair for service
    What about popular pairs, e.g., country=USA?
      Will overload nodes in DHT!
  Tradeoffs
    Load-balancing and updates difficult
    Search styles limited to exact match
    Robust to failures
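The "one entry per attribute-value pair" idea, and the hot-spot problem it creates, can be sketched with a dictionary standing in for the DHT. This is a simplified illustration and not Twine's actual key scheme: the use of SHA-1 over the string `attr=value`, the function names, and the client-side intersection for multi-attribute queries are all assumptions.

```python
import hashlib

def key_for(attr, value):
    """Hash one attribute-value pair to a DHT key (a 160-bit integer here)."""
    digest = hashlib.sha1(f"{attr}={value}".encode()).digest()
    return int.from_bytes(digest, "big")

def insert(index, service_id, attrs):
    """Register the service under the key of every attribute-value pair."""
    for attr, value in attrs.items():
        index.setdefault(key_for(attr, value), set()).add(service_id)

def lookup(index, attr, value):
    """Exact-match lookup on a single pair."""
    return index.get(key_for(attr, value), set())

def lookup_all(index, attrs):
    """AND of several pairs: fetch each pair's set and intersect them."""
    sets = [lookup(index, a, v) for a, v in attrs.items()]
    return set.intersection(*sets) if sets else set()

if __name__ == "__main__":
    index = {}  # stands in for the DHT: key -> set of service ids
    insert(index, "s1", {"country": "USA", "type": "printer"})
    insert(index, "s2", {"country": "USA", "type": "camera"})
    insert(index, "s3", {"country": "Canada", "type": "printer"})
    # Every US service lands on the same key, hence on the same DHT node.
    print(len(lookup(index, "country", "USA")))
    print(lookup_all(index, {"country": "USA", "type": "printer"}))
```

Because hashing destroys ordering, only exact matches are possible, and each update to a service has to touch one key per pair.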
Outline
  Overview
  INS
  GHT
Motivating Example
  Name-addressed, or data-centric, queries appropriate:
    Query(“whale”) returns {(whale, i, [u,v]), (whale, j, [x,y])}
  Expressiveness: single-attribute name lookup
Solution 1: Local Storage
  Broadcast query, collect results
  For n nodes, Q event types queried for, D_q detected & queried events
    Total msg-links = Q * n + D_q * sqrt(n)
    Hotspot (at access point) = Q + D_q
Solution 2: External Storage
  Collect all events
  For n nodes, Q event types queried for, D_t total detected events
    Total msg-links = D_t * sqrt(n)
    Hotspot (at access point) = D_t
  But D_t might be large
Solution 3: Data-Centric Storage (DCS)
  Rendezvous for queries & data
  For n nodes, Q event types queried for, D_t total detected events, D_q detected & queried events
    Total msg-links = Q * sqrt(n) + D_q * sqrt(n) + D_t * sqrt(n)
    Hotspot (at access point) = Q + D_q
  With summarization
    Total msg-links = Q * sqrt(n) + Q * sqrt(n) + D_t * sqrt(n)
    Hotspot (at access point) = 2Q
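Plugging numbers into the cost formulas for the three storage schemes makes the comparison concrete. The functions below transcribe the slide formulas directly; the sample values (n = 10,000 nodes, Q = 50, D_q = 5,000, D_t = 10,000) are invented for illustration, and Q is read here as the number of queries issued, which is the reading under which the Q * n flooding term makes sense.

```python
from math import sqrt

def local_storage(n, Q, D_q):
    """Flood each query to all n nodes; matching events travel back."""
    return {"total": Q * n + D_q * sqrt(n), "hotspot": Q + D_q}

def external_storage(n, D_t):
    """Ship every detected event to the access point."""
    return {"total": D_t * sqrt(n), "hotspot": D_t}

def dcs(n, Q, D_q, D_t, summarize=False):
    """Events and queries meet at a rendezvous node inside the network."""
    replies = Q if summarize else D_q  # one summary reply per query
    return {
        "total": Q * sqrt(n) + replies * sqrt(n) + D_t * sqrt(n),
        "hotspot": Q + replies,
    }

if __name__ == "__main__":
    n, Q, D_q, D_t = 10_000, 50, 5_000, 10_000
    print("local   ", local_storage(n, Q, D_q))
    print("external", external_storage(n, D_t))
    print("DCS     ", dcs(n, Q, D_q, D_t))
    print("DCS+sum ", dcs(n, Q, D_q, D_t, summarize=True))
```

With these sample values, external storage sends the fewest messages in total, while DCS with summarization puts by far the least load on the access point (2Q = 100 messages).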
Tradeoffs
  Local storage has greatest total message count as n grows
  External storage always sends fewer messages than DCS
  When many more event types are detected than queried for, DCS has least hotspot message count
  DCS permits summarization of events (return multiple events in one packet)
  Need a simple way to implement DCS
Geographic Hash Table
  Two operations supported:
    Put(k, v) stores v, the event, according to key k
    Get(k) retrieves the value associated with key k
  Hash a key k into geographic coordinates; store and retrieve events for that key at that location
  Spreads load evenly across key space!
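The Put/Get interface can be sketched by hashing a key to a point in the deployment area and storing the value at whichever node is nearest to that point. Everything here is an assumption for illustration: the SHA-256 based hash, the 100 x 100 field, the node positions, and the global view of node locations (a real deployment would reach the home node through geographic routing and would also replicate the data).

```python
import hashlib

def hash_to_location(key, width=100.0, height=100.0):
    """Deterministically map a key to (x, y) coordinates inside the field."""
    d = hashlib.sha256(key.encode()).digest()
    x = int.from_bytes(d[:8], "big") / 2**64 * width
    y = int.from_bytes(d[8:16], "big") / 2**64 * height
    return (x, y)

class GHT:
    def __init__(self, nodes):
        self.nodes = nodes                        # node id -> (x, y)
        self.store = {nid: {} for nid in nodes}   # per-node local storage

    def home_node(self, key):
        """The node geographically closest to the key's hashed location."""
        x, y = hash_to_location(key)
        return min(
            self.nodes,
            key=lambda nid: (self.nodes[nid][0] - x) ** 2
                            + (self.nodes[nid][1] - y) ** 2,
        )

    def put(self, key, value):
        self.store[self.home_node(key)].setdefault(key, []).append(value)

    def get(self, key):
        return self.store[self.home_node(key)].get(key, [])

if __name__ == "__main__":
    ght = GHT({"a": (10, 10), "b": (90, 10), "c": (10, 90), "d": (90, 90)})
    ght.put("whale", ("i", (3, 4)))
    ght.put("whale", ("j", (7, 8)))
    print(ght.home_node("whale"), ght.get("whale"))
```

Any node can compute the same location from the key alone, so producers and consumers rendezvous without a directory.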
Geographic Routing (GPSR)
  Greedy geographic routing, with fixes for empty spaces
  Routes data to nodes surrounding geographic destination
  Closest node stores data for that coordinate hash
  Replication on all nodes that enclose the coordinates to ensure persistence of data
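The greedy part of geographic routing can be sketched as "forward to the neighbor closest to the target until no neighbor is closer". The sketch deliberately omits GPSR's perimeter mode (the "fixes for empty spaces"), so it simply stops at a local minimum. The topology, node names, and function name are hypothetical.

```python
from math import dist

def greedy_route(nodes, neighbors, src, target):
    """Greedy geographic forwarding toward the point `target`.

    nodes maps node id -> (x, y); neighbors maps node id -> list of ids.
    Returns the path of node ids. The last node is the one where no
    neighbor is strictly closer to the target, which is either the node
    nearest the destination or the edge of a void.
    """
    path = [src]
    current = src
    while True:
        best = min(neighbors[current],
                   key=lambda nid: dist(nodes[nid], target),
                   default=None)
        if best is None or dist(nodes[best], target) >= dist(nodes[current], target):
            return path
        current = best
        path.append(current)

if __name__ == "__main__":
    nodes = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (2, 1)}
    neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(greedy_route(nodes, neighbors, "a", (2, 1.2)))
```

Each hop strictly reduces the distance to the target, so the loop always terminates. Recovering from a void is what the perimeter mode adds.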
Discussion
  3 key objectives: scalable, adaptive to change, and energy efficient
  Traffic concentration, data concentration and message counts are good metrics for scaling and energy efficiency
  Data must be stored persistently and be retrievable consistently from the same place to allow adaptation to change
Next Lecture
  Lecturer: Srini
  Topic: positioning
    Real world: GPS, Radar, Bat, Cricket
    Network: IDMAPS, GNP, GeoPing
  Readings and questions posted on Saturday
  Announcements:
    Final exam: May 5th, 8:30 – 11:30 AM
    Mar 4th: project checkpoint