15-744: Computer Networking

15-744: Computer Networking
L-10 Evolving the Data Plane

Adding New Functionality to the Internet
Adding new functionality to the data plane Future internet architectures Overlay networks Active networks OpenFlow’s next generation  P4 Reading P4: Programming Protocol-IndependentPacket Processors Active network vision and reality:lessons from a capsule-based system Optional XIA: Efficient Support for Evolvable Internetworking A Case for End System Multicast

A History of Internet Evolution
Many success stories 1983  Flag day switch from NCP to IP 1988  /etc/hosts to DNS 1996  TCP SACK  EGP to BGP [1..4] 1992 Danzig study shows serious problems with DNS

But also many failures 1986  IP Multicast 1997  IntServ 1998  DiffServ 1995  IPv6 Internet capabilities, traceback schemes, explicit congestion control, content-oriented processing, …

Applications Innovation both above and below IP IP is well-known as the narrow waist of the Internet. Applications use this network layer to communicate with others on the Internet, while underlying technologies such as the physical layer and the link layer only need to support the IP interface. This narrow waist design has allowed a great deal of innovation both above and below IP. ---- However, it’s a big question whether IP itself can enjoy such innovation. While there has been an increasing demand for diverse communication types such as service and content, IP is primarily designed and used for host-based communication. IP Technology Hard to change IP …especially after 1990

Clean-Slate vs. Evolutionary
Successes of the 80s followed by failures of the 90’s IP Multicast QoS RED (and other AQMs) ECN … Concern that Internet research was dead Difficult to deploy new ideas What did catch on was limited by the backward compatibility required

Outline Active Networks Overlay Routing
Architectures for Incremental Deployment P4

Why Active Networks? Traditional networks route packets looking only at destination Also, maybe source fields (e.g. multicast) Problem: Rate of deployment of new protocols and applications is too slow Applications that do more than IP forwarding Firewalls Web proxies and caches Transcoding services Nomadic routers (mobile IP) Transport gateways (snoop) Reliable multicast (lightweight multicast, PGM)

Active Networks Nodes (routers) receive packets:
Perform computation based on their internal state and control information carried in packet Forward zero or more packets to end points depending on result of the computation Users and apps can control behavior of the routers End result: network services richer than those by the simple IP service model

Variations on Active Networks
Programmable routers (e.g. OpenFlow) More flexible than current configuration mechanism For use by administrators or privileged users Active control (e.g. SDN) Forwarding code remains the same Useful for management/signaling/measurement of traffic “Active networks” Computation occurring at the network (IP) layer of the protocol stack  capsule based approach Programming can be done by any user Source of most active debate

Case Study: MIT ANTS System
Conventional Networks: All routers perform same computation Active Networks: Routers have same runtime system Tradeoffs between functionality, performance and security

System Components Capsules Active Nodes: Code Distribution Mechanism
Execute capsules of protocol and maintain protocol state Provide capsule execution API and safety using OS/language techniques Code Distribution Mechanism Ensure capsule processing routines automatically/dynamically transfer to node as needed

Capsules Each user/flow programs router to handle its own packets
Code sent along with packets Code sent by reference Protocol: Capsules that share the same processing code May share state in the network Capsule ID (i.e. name) is MD5 of code

Type Dependent Header Files
Capsules Active Node IP Router Active Node Capsule Capsule IP Header Version Type Previous Address Type Dependent Header Files Data ANTS-specific header Capsules are forwarded past normal IP routers

Capsules Request for code Active Node 1 IP Router Active Node 2 Capsule Capsule When node receives capsule uses “type” to determine code to run What if no such code at node? Requests code from “previous address” node Likely to have code since it was recently used

Capsules Code is transferred from previous node Size limited to 16KB
Code Sent Active Node 1 IP Router Active Node 2 Capsule Capsule Code is transferred from previous node Size limited to 16KB Code is signed by trusted authority (e.g. IETF) to guarantee reasonable global resource use

Research Questions Execution environments
What can capsule code access/do? Safety, security & resource sharing How isolate capsules from other flows, resources? Performance Will active code slow the network? Applications What type of applications/protocols does this enable?

Functions Provided to Capsule
Environment Access Querying node address, time, routing tables Capsule Manipulation Access header and payload Control Operations Create, forward and suppress capsules How to control creation of new capsules? Storage Soft-state cache of app-defined objects

Safety, Resource Mgt, Support
Provided by mobile code technology (e.g. Java) Resource Management: Node OS monitors capsule resource consumption Support: If node doesn’t have capsule code, retrieve from somewhere on path

Applications/Protocols
Limitations Expressible  limited by execution environment Compact  less than 16KB Fast  aborted if slower than forwarding rate Incremental  not all nodes will be active Proof by example Host mobility, multicast, path MTU, Web cache routing, etc.

Discussion Active nodes present lots of applications with a desirable architecture Key questions Is all this necessary at the forwarding level of the network? Is ease of deploying new apps/services and protocols a reality?

Proposed -Centric Networking
Content: Named Data Networking Mobility: MobilityFirst Cloud: Nebula We are not the first to point out the lack of support for new communication types on the current Internet. There are several proposals to promote a certain communication type as the first-class. As we saw in this NSDI, Serval provides service-centric networking. For content, Named Data Networking provides content-centric networking by introducing a content cache on every router. They are good at achieving their own goal of supporting a new communication type. ---- However, I would like to point out that when an Internet architecture elevates one type of communication, support for other communication types may become less effective. This very problem has occurred to IP, as its host-oriented design hinders using other communication types. This raises a question: Is there a way to support multiple communication types, such as host, service, content, or mobility, on a single Internet architecture? Problem: Focusing on one communication type may hinder using other communication types, as occurred to IP Can we support heterogeneous communication types on a single Internet architecture?

Future -Centric Networking
Service, content, mobility, and cloud did not receive much attention before Yet more networking styles may be useful in the future E.g., DTN, wide-area multicast, …? Also, we cannot just consider the set of communication types already known today, but we also need to consider new types of communication that may occur in the future. We did not anticipate the high popularity of service and content or mobility a long time ago, but they have become more important nowadays, and this strongly suggests that we are likely to see exotic communication types, such as DTN, becoming useful in the future, and that we will discover completely new communication types as well. ---- From the long history of the IPv6 deployment, we have learned that making any changes to the deployed Internet architecture is very challenging. Then, when we design a new Internet architecture, how can we make it to allow new communication types without redesigning the architecture? Problem: Introducing additional communication types to the existing network can be very challenging Can we support future communication types without redesigning the Internet architecture?

Legacy Router May Prevent Innovation
Internet “I got a computer with Awesome-Networking announced at Sigcomm 2022! Can I use it right now?” “Ouch, we just replaced all of our routers built in Can you wait for another 10 years for new routers?” The previous question leads to another practical issue. When the network can support a new communication type, how can we make legacy routers to seamlessly handle the network traffic using the new communication type? For example, suppose we have a latest computer that can talk a new networking style. Then, should we wait for the entire network to be upgraded to support the new networking style, or can we just use the new networking style right now? ---- This problem calls for a solution to allow using a new communication type even before all routers in the network natively support it. Problem: Using a new communication type may require every legacy router in the network to be upgraded Can we allow use of a new communication type even when the network is yet to natively support it?

XIA’s Goals and Design Pillars
“Principal types” “Fallbacks” Allow using new communication types at any point (incremental deployment) Support multiple communication types concurrently (heterogeneity) We have seen three problems: supporting multiple communication types on a single Internet architecture, supporting additional communication types without redesigning this Internet architecture, and allowing using new communication types without waiting for the entire network to support them. ---- In XIA, we have two design pillars to answer these questions. The first design pillar is called principal types. The concept of principal types effectively allows multiple narrow waists on the Internet, so XIA can handle a heterogeneous and evolvable set of communication types. The second design pillar of XIA is fallbacks. By using fallbacks, end-hosts can always use new communication types to express intent, without worrying about whether the network can understand the new communication type. The end-hosts still can transparently benefit from any in-network support for the new communication types when it is eventually deployed to the network. Support future communication types (flexibility)

Define your own communication model
Principal Types Define your own communication model Let’s discuss principal types, the first design pillar of XIA. The use of principal types allows you to define your own communication model in XIA.

Principals Current Internet XIA Principal type
Type-specific identifier IP address The current Internet has one principal type, the IP address, which identifies a host. In contrast, in XIA, each principal specifies its type, and within each type, principals are identified by using type-specific identifiers. ---- For host principals, XIA uses a host-type and the hash of the host’s public key to uniquely identify the host. As XIA allows different types of principals, it can represent service and content as well. Service principals are similar to host principals, except they use the service’s public key shared by multiple hosts handling the service. Content principals use the hash of the content itself, which allows unique identification of the content. You can add a new principal type to describe entities in a future communication type, and you can define the meaning of the identifier for this principal type. Hash of host’s public key Host 0xF63C7A4… Hash of service’s public key Service 0x8A37037… Content 0x47BF217… Hash of content Future …

Principal Definition 1: Address Allocation and Intrinsic Security
XIA uses self-certifying identifiers that guarantee security properties for communication operation Host ID is a hash of its public key – accountability (AIP) Content ID is a hash of the content – correctness Does not rely on external configuration Intrinsic security is specific to the principal type Example: retrieve content using … Content XID: content is correct Service XID: the right service provided content Host XID: content was delivered from right host New Principal types can be added from the end-host.

Principal Definition 2: Type-Specific Semantics
Host 0xF63C7A4… Contact a host These principal types can have different semantics. When sending a packet, we can use a host principal as the destination address in order to contact a particular host, similar to IP. ---- To use a service, we can simply use a service principal type as the destination. We do not have to specify which host should serve it. A hypothetical service-enabled network may be able to route directly to the service. To retrieve content, a content principal can be used. In this case, any entity who possesses the content in the network can respond to the content request. Service 0x8A37037… Use a service Content 0x47BF217… Retrieve content

Principal Definition 3: Type-Specific Processing
Common processing Host-specific processing Service-specific processing Input Output XIA routers are designed to support such diverse communication types. The routers can have a separate processing logic specific to each principal type, fulfilling different needs from different communication types. For instance, services may require support for host load balancing or service migration between multiple hosts. Content-based communication may require content caching, which overhears and stores the content traffic and uses it later to serve content requests locally. Content-specific processing … XIA router Type-specific processing examples Service: load balancing or service migration Content: content caching

Routers with Different Capabilities
Routers are not required to support every principal type The only requirement: Host-based communication As we mentioned, routers may be upgraded at different times. As a result, XIA routers are not required to handle every principal type, as long as they support host-based communication. Some routers may support service, and the other routers may support content communication instead. We expect such diversity in routers’ capability will be common. Common Host Common Host Common Host Service Content Host-only router Service-enabled router Content-enabled router

Using Principal Types that are Not Understood by Legacy Routers?
Content-enabled router Content-enabled router Then, here is one scenario we may observe frequently. There are two hosts who want to use content principals. If the entire network consists of modern content-enabled routers, there is no problem. ---- However, what if there is any legacy router in the middle of the path? This communication will fail, because this legacy router does not understand content principals. So, the diversity of routers seems to cause a reachability problem for new principal types. As this situation, if some routers understand a new communication type while the others do not, how could end-point applications use the new communication type as they want? Legacy router without content support Want to communicate using content principals

Tomorrow’s communication types… today!
Fallbacks Tomorrow’s communication types… today! To solve this problem with legacy routers, we use a mechanism called fallbacks, the second design pillar of XIA. The use of fallbacks allows you to use tomorrow’s communication types today, without waiting for the entire network to natively support the new communication types.

Fallbacks: Alternative Ways for Routers to Fulfill Intent of Packet
Intent: Retrieve Content Fallback: Contact , who understands request Host The fallbacks are provided by end hosts, and these fallbacks are alternative ways to achieve the intent of the packet. In this example of fallbacks, the intent is to retrieve content, and a host principal is specified as a fallback. ---- Some network will be able to understand and directly use the content principal. However, if the router is not content-ready, the router will look at the fallback and use host-based communication that the legacy router can easily understand. Accordingly, by supplying appropriate fallbacks, end-point applications can directly specify the intent in the packet, and the delivery of the packet will be still guaranteed. Now, the next question is how to encode intent and fallbacks in packets to allow great flexibility for diverse communication styles while maintaining good packet processing performance. Content What the network does: With content-enabled routers, use for routing Otherwise, use for routing (always succeeds) Content Host

Your address is more than a number
DAG-Based Address Your address is more than a number Here, to represent an Internet address, we propose to use a direct acyclic graph or DAG. Our DAG-based addressing scheme provides a unified and expressive way to instruct how routers should route the packets.

DAG (Direct Acyclic Graph)-Based Addressing Enables Fallbacks
Packet sender Intent Routing choice Content To explain DAG-based addressing in XIA, I’ll reuse the previous example, where the intended destination of the packet is a content principal, and the fallback is a host principal. ---- First of all, in every DAG address, we put a source node to indicate a packet sender. Then, we add a node to represent the intent of the packet, the content principal. We connect the source node to the intent node, to indicate a routing choice for the routers. That is, the routers will attempt to use the content principal for forwarding the packet. Now, let’s add a fallback, the host principal. This node is connected from the packet sender node, but now this lower edge has a lower priority than the upper edge. It means that the router should prefer the upper edge for routing, but it may choose the lower edge if the router does not understand the content principal. So, using multiple outgoing edges from a node in such a way realizes the fallback mechanism. The host principal node is connected to the content principal node, meaning that the host knows how to handle the content request. This is how XIA represents intent and fallbacks as a single Internet address in a compact and flexible form. Host Another routing choice (with lower priority) This host knows how to handle content request Fallback

In the Beginning… OpenFlow was simple A single rule table
Priority, pattern, actions, counters, timeouts Matching on any of 12 fields, e.g., MAC addresses IP addresses Transport protocol Transport port numbers

Over the Past Five Years…
Proliferation of header fields Version Date # Headers OF 1.0 Dec 2009 12 OF 1.1 Feb 2011 15 OF 1.2 Dec 2011 36 OF 1.3 Jun 2012 40 OF 1.4 Oct 2013 41 OF 1.4 did stop for lack of wanting more, but just to put on the breaks. This is natural and a sign of success of OpenFlow: enable a wider range of controller apps expose more of the capabilities of the switch E.g., adding support for MPLS, inter-table meta-data, ARP/ICMP, IPv6, etc. New encap formats arising much faster than vendors spin new hardware Multiple stages of heterogeneous tables Still not enough (e.g., VXLAN, NVGRE, STT, …)

“Classic” OpenFlow (1.x)
SDN Control Plane Installing and querying rules Target Switch

“OpenFlow 2.0” SDN Control Plane Compiler Target Switch Configuring:
Parser, tables, and control flow Populating: Installing and querying rules Two modes: (i) configuration and (ii) populating Compiler configures the parser, lays out the tables (cognizant of switch resources and capabilities), and translates the rules to map to the hardware tables The compiler could run directly on the switch (or at least some backend portion of the compiler would do so) Compiler Parser & Table Configuration Rule Translator Target Switch

Abstract Model Reconfigurable parser + Non-sequential match-action + Protocol independent processing

Simple Motivating Example
Data-center routing Top-of-rack switches Two tiers of core switches Source routing by ToR Hierarchical tag (mTag) Pushed by the ToR Four one-byte fields Two hops up, two down up2 down1 up1 down2 ToR ToR

Header Formats Header Ordered list of fields
A field has a name and width header ethernet { fields { dst_addr : 48; src_addr : 48; ethertype : 16; } header vlan { fields { pcp : 3; cfi : 1; vid : 12; ethertype : 16; } header mTag { fields { up1 : 8; up2 : 8; down1 : 8; down2 : 8; ethertype : 16; }

Parser State machine traversing the packet
Extracting field values as it goes parser start { parser vlan { ethernet; switch(ethertype) { } case 0xaaaa : mTag; case 0x800 : ipv4; parser ethernet { switch(ethertype) { } case 0x8100 : vlan; case 0x9100 : vlan; parser mTag { case 0x800 : ipv4; switch(ethertype) { case 0x800 : ipv4; } } } }

Typed Tables Describe each packet-processing stage
What fields are matched, and in what way What action functions are performed (Optionally) a hint about max number of rules Compiler uses this to - determine what kind of memory (width, height), and what kind (TCAM, SRAM) - reject rules that violate the type The “add_mTag” is an “action function”. In other kinds of tables, multiple different action functions may be allowed. table mTag_table { reads { ethernet.dst_addr : exact; vlan.vid : exact; } actions { add_mTag; max_size : 20000;

Action Functions Custom actions built from primitives
Add, remove, copy, set, increment, checksum action add_mTag(up1, up2, down1, down2, outport) { add_header(mTag); copy_field(mTag.ethertype, vlan.ethertype); set_field(vlan.ethertype, 0xaaaa); set_field(mTag.up1, up1); set_field(mTag.up2, up2); set_field(mTag.down1, down1); set_field(mTag.down2, down2); set_field(metadata.outport, outport); } Push mTag on the packet Copy the ethertype of the VLAN header to correctly identify the next level of header Make the VLAN header point to the mTag Set the four fields of the mTag Ensure the packet goes to the right egess port

Control Flow Flow of control from one table to the next
Collection of functions, conditionals, and tables For a ToR switch: Source check (one rule per port): verify mTag only on ports to the core (actions: fault, strip and record metadata, pass) Local switching: checks if destination is local, and otherwise goes to mTag table mTag table: add mTag Egress check: has someone set the outport ToR From core (with mTag) From local hosts (with no mTag) Source Check Table Local Switching Egress mTag Miss: Not Local

Control Flow Flow of control from one table to the next
Collection of functions, conditionals, and tables Simple imperative representation control main() { table(source_check); if (!defined(metadata.ingress_error)) { table(local_switching); if (!defined(metadata.outport)) { table(mTag_table); } table(egress_check); Source check (one rule per port): verify mTag only on ports to the core (actions: fault, strip and record metadata, pass) Local switching: checks if destination is local, and otherwise goes to mTag table mTag table: add mTag Egress check: verify egress is resolved, do not retag packets received with tag, reads egress, etc.

Overlay Routing Basic idea: Tunnels Why?
Treat multiple hops through IP network as one hop in “virtual” overlay network Run routing protocol on overlay nodes Basic technique: tunnel packets Tunnels IP-in-IP encapsulation Poor interaction with firewalls, multi-path routers, etc. Why? For performance – can run more clever protocol on overlay For functionality – can provide new features such as multicast, active processing, IPv6

Overlay for Performance [S+99]
Why would IP routing not give good performance? Policy routing – limits selection/advertisement of routes Early exit/hot-potato routing – local not global incentives Lack of performance based metrics – AS hop count is the wide area metric How bad is it really? Look at performance gain an overlay provides

Quantifying Performance Loss
Measure round trip time (RTT) and loss rate between pairs of hosts ICMP rate limiting Alternate path characteristics 30-55% of hosts had lower latency 10% of alternate routes have 50% lower latency 75-85% have lower loss rates

Overlay for Features How do we add new features to the network?
Does every router need to support new feature? Choices Reprogram all routers  active networks Support new feature within an overlay Examples: IP V6 & IP Multicast Tunnels between routers supporting feature Mobile IP Home agent tunnels packets to mobile host’s location

Overlay Example: Multicast
Unicast: one source to one destination Multicast: one source to many destinations Two main functions: Efficient data distribution Logical naming of a group

Example Applications Broadcast audio/video Push-based systems
Software distribution Web-cache updates Teleconferencing (audio, video, shared whiteboard, text editor) Multi-player games Server/service location Other distributed applications

Multicast – Efficient Data Distribution
Src Src

Multicast Router Responsibilities
Learn of the existence of multicast groups (through advertisement) Identify links with group members Establish state to route packets Replicate packets on appropriate interfaces Routing entry: Src, incoming interface List of outgoing interfaces

Supporting Multicast on the Internet
Application ? At which layer should multicast be implemented? In the hourglass Internet architecture, IP is the compatibility layer in the Internet architecture. All hosts must implement IP Two choices multicast at IP or application: only a subset, customizability One important architecture question is, at which layer should multicast be implemented. The convention wisdom has been to support multicast in the IP layer for efficiency and performance reasons. However, more than 10 years since this is proposed, it still has not been widely deployed. This paper revisits this question with emphasis on Internet evaluation. In particular, we show that multicast at the application layer can be efficient compared to IP Multicast. ? IP Network Internet architecture

IP Multicast Highly efficient Good delay MIT Berkeley UCSD CMU routers
In the IP Multicast architecture, - routers replicate multicast pkts inside the network to all receivers - motivation: highly efficiency - highly efficient: single pkt that traverse each physical link - delay is good from source to all receivers CMU routers end systems multicast flow Highly efficient Good delay

End System Multicast Overlay Tree MIT1 MIT Berkeley MIT2 UCSD CMU1 CMU
Recently, we and others have advocated for an alternative architecture, where all multicast functionality, including pkt replication and group management are pushed to end systems. - We call this architecture End System Multicast - In this architecture, end system organize themselves into an overlay tree root at the source - data is sent along the overlay tree. - It is an overlay in the sense that each link in the overlay tree corresponds to a physical path in the underlying network CMU CMU2 UCSD MIT1 MIT2 CMU2 Overlay Tree Berkeley CMU1

Potential Benefits Over IP Multicast
Quick deployment All multicast state in end systems Computation at forwarding points simplifies support for higher level functionality In ESM, there is no additional support from the routers except unicast, so the deployment can be immediate. In IP Multicast, routers need to maintain per-group state. - This causes concerns regards to scalability to number of groups. - In ESM, all multicast state is pushed to end systems. Finally, computation at end systems can potentially simplify support for higher-layer functionality, - such as congestion control and reliability, - as well as application-specific customization, such as XXX. These high-layer functionality is harder to achieve in IP Multicast because - the data splitting points are at the routers and they are constraint in computation. MIT1 MIT Berkeley MIT2 UCSD CMU1 CMU CMU2

Concerns with End System Multicast
Self-organize recipients into multicast delivery overlay tree Must be closely matched to real network topology to be efficient Performance concerns compared to IP Multicast Increase in delay Bandwidth waste (packet duplication) However, there are several concerns with ESM. - In the absence of support from the routers, there is a challenge to coordinate among end systems to construct efficient overlay tree. Even with an efficient overlay tree, … - there can be an increase in delay because data can travel through multiple overlay hops, as shown for MIT2. In IP Multicast, data is sent directly along the forwarding path. - moreover, there can be waste in bandwidth because duplicated pkts can traverse the same physical links, as shown in links near UCSD. In IP Multicast, a single pkt traverse each physical link at most once. MIT2 Berkeley MIT1 UCSD CMU2 CMU1 IP Multicast MIT2 Berkeley MIT1 CMU1 CMU2 UCSD End System Multicast

Next Lecture Middleboxes NFV Readings:
Network Functions Virtualisation (skim) Design and Implementation of a Consolidated Middlebox Architecture (read whole paper)

15-744: Computer Networking

Similar presentations

Presentation on theme: "15-744: Computer Networking"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

15-744: Computer Networking

Similar presentations

Presentation on theme: "15-744: Computer Networking"— Presentation transcript:

Similar presentations

About project

Feedback