Project 3b: Stateful, Active Firewall


Project 3b Stateless, Passive Firewall (3a) Stateful, Active Firewall (3b) In the first part of project 3, you designed a stateless, passive firewall. In this part, you will extend your firewall to make it stateful and active.

What stays… Same framework Same VM Same test harness Same tools Same basic firewall behavior Most of what you’ve got used to in the first part of the project won’t change – including the framework, VM, test harness, tools and basic firewall behavior.

What’s new? Three new rules: deny tcp <IP address> <port> deny dns <domain name> log http <host name> In this part, you will implement three new rules.

Active! deny tcp <IP address> <port> [Diagram: a client sends a SYN toward the server; the firewall replies to the client with a RST.] For the first kind of rule, you have to explicitly deny TCP connection requests from a specified client IP-port pair. On intercepting a SYN packet from such a source, your firewall must not only drop the packet but also respond with a RST packet so that the client doesn't keep trying to connect.
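To make the "respond with a RST" step concrete, here is a minimal sketch using scapy; this is only an illustration, since in the project you will likely be working with raw packet bytes and building the IP/TCP headers and checksums yourself. The handler name and flag choices below are assumptions, not the project spec.

```python
# Hypothetical sketch (scapy, not the project framework): crafting the RST a
# firewall could send when it intercepts a SYN matching "deny tcp <IP> <port>".
from scapy.all import IP, TCP, send

def send_rst_for_syn(syn_pkt):
    ip = syn_pkt[IP]
    tcp = syn_pkt[TCP]
    # The RST goes back to the connection initiator: swap addresses and ports.
    rst = IP(src=ip.dst, dst=ip.src) / TCP(
        sport=tcp.dport,
        dport=tcp.sport,
        flags="RA",          # RST+ACK so the client tears the attempt down immediately
        seq=0,
        ack=tcp.seq + 1,     # acknowledge the SYN (a SYN consumes one sequence number)
    )
    send(rst, verbose=False)
```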

Even more active! deny dns <domain name> [Diagram: a DNS request crosses from the internal (int) to the external (ext) interface; the firewall injects a DNS response resolving the denied name to 54.173.224.150.] For this rule, your firewall answers the client's DNS query for a denied domain itself, with a response pointing the name at 54.173.224.150.
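For illustration only, a hedged sketch of the injected DNS answer, again using scapy rather than the project framework; it assumes a single A-record question and reuses the 54.173.224.150 address shown on the slide.

```python
# Hypothetical sketch: forging the DNS response for "deny dns <domain name>".
from scapy.all import IP, UDP, DNS, DNSRR, send

DENY_ADDR = "54.173.224.150"

def send_forged_dns_response(query_pkt):
    ip, udp, dns = query_pkt[IP], query_pkt[UDP], query_pkt[DNS]
    qname = dns.qd.qname                       # the name the client asked for
    resp = (
        IP(src=ip.dst, dst=ip.src)
        / UDP(sport=udp.dport, dport=udp.sport)
        / DNS(id=dns.id, qr=1, aa=1,           # response bit set, pretend authoritative
              qd=dns.qd,                       # echo the question section back
              an=DNSRR(rrname=qname, type="A", ttl=1, rdata=DENY_ADDR))
    )
    send(resp, verbose=False)
```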

Stateful! log http <host name> Log HTTP transactions (request/response pairs). Example, for the rule log http *.berkeley.edu:
www-inst.eecs.berkeley.edu GET /~cs168/fa14/ HTTP/1.1 301 254
www-inst.eecs.berkeley.edu GET /~cs168/fa14/ HTTP/1.1 200 273
www-inst.eecs.berkeley.edu GET /favicon.ico HTTP/1.1 200 0
...
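As a sketch of where those fields come from, here is one way to turn an already-reassembled request/response pair into a log line with the same field order as the example above. The helper and the default used when Content-Length is missing are assumptions; check the spec for the required behavior.

```python
# Hypothetical sketch: <host> <method> <path> <version> <status> <content-length>
def http_log_line(request_headers, response_headers):
    req_lines = request_headers.decode("latin-1").split("\r\n")
    resp_lines = response_headers.decode("latin-1").split("\r\n")

    method, path, version = req_lines[0].split(" ", 2)   # e.g. "GET /~cs168/fa14/ HTTP/1.1"
    status_code = resp_lines[0].split(" ")[1]            # e.g. "301"

    def header_value(lines, name, default):
        for line in lines[1:]:
            if line.lower().startswith(name.lower() + ":"):
                return line.split(":", 1)[1].strip()
        return default

    host = header_value(req_lines, "Host", "")
    length = header_value(resp_lines, "Content-Length", "0")   # assumed default; see spec
    return " ".join([host, method, path, version, status_code, length])
```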

Stateful! Handle: segmented TCP packets, headers spanning multiple packets, packet drops/re-orderings, and more… (read the project spec!) My advice? Start early.
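A minimal sketch of the reassembly bookkeeping this implies, for one direction of a single TCP connection; it ignores sequence-number wrap-around and other corner cases the spec may require, and the class and method names are made up for illustration.

```python
# Hypothetical sketch: buffer out-of-order segments by sequence number and
# deliver bytes in order, so drops/re-orderings still yield a clean byte stream.
class StreamReassembler:
    def __init__(self, init_seq):
        self.next_seq = init_seq      # next byte offset we expect to deliver
        self.segments = {}            # seq -> payload bytes, held out of order
        self.data = b""               # in-order bytes delivered so far

    def add_segment(self, seq, payload):
        if seq + len(payload) <= self.next_seq:
            return                    # old duplicate, nothing new
        self.segments[seq] = payload
        while True:                   # deliver as much contiguous data as possible
            progressed = False
            for seq2, chunk in list(self.segments.items()):
                end = seq2 + len(chunk)
                if end <= self.next_seq:
                    del self.segments[seq2]          # entirely old now, discard
                elif seq2 <= self.next_seq:
                    self.data += chunk[self.next_seq - seq2:]
                    self.next_seq = end
                    del self.segments[seq2]
                    progressed = True
            if not progressed:
                return
```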

What we won’t check… Firewall rules from 3a Country-based matching Takeaway? Don’t worry too much if your solution for 3a wasn’t complete But make sure the solution to this one is! We won’t check most of what you were supposed to do in 3a, including the old firewall rules and country-based matching. This ensures the grading for 3b is decoupled from the grading for 3a. Don’t worry too much if your solution to 3a was incomplete, but make sure you complete this one.

NO CHEATING WE RUN COPY CHECKER The final reminder: please do not copy code from others; we will run a copy checker on every submission.

Logistics GSIs: Anurag, Shoumik, and Sangjin. Additional OH: will be announced on Piazza. Slides and specs (code framework same as 3a): online by midnight today. Due: at noon on Dec 3rd. The project spec and code should be online by the end of today. This project is led by Shoumik, Sangjin, and me. If you have general questions about protocols, packet formats, etc., you can talk to any GSI. But if you have more specific questions, please ask one of us.

Datacenters (part 2) CS 168, Fall 2014 Sylvia Ratnasamy http://inst.eecs.berkeley.edu/~cs168/

What you need to know Characteristics of a datacenter environment goals, constraints, workloads, etc. How and why DC networks are different (vs. WAN) e.g., latency, geo, autonomy, … How traditional solutions fare in this environment E.g., IP, Ethernet, TCP, DHCP, ARP Specific design approaches

Recap: Last Lecture Key requirements High “bisection bandwidth” Low latency, even in the worst-case Large scale Low cost

Recap: High bisection BW topology E.g., “Clos” topology Multi-stage network All switches have k ports k/2 ports up, k/2 down All links have same speed “scale out” approach

E.g., “Fat Tree” Topology [Sigcomm’08] Few things to notice: 3 stages (can think of as edge / aggregation / core) All switches have same number of ports (=4) [hence k^3/4=16 end hosts] All links have same speed Many paths
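As a quick sanity check on the counts mentioned on this slide, here is a small sketch; the k^3/4 host count is from the slide, while the edge/aggregation/core switch counts are the standard fat-tree bookkeeping and are included here as an assumption.

```python
# Fat-tree sizes for k-port switches: k pods, each with k/2 edge and k/2
# aggregation switches, (k/2)^2 core switches, and k^3/4 hosts total.
def fat_tree_sizes(k):
    assert k % 2 == 0
    hosts = (k // 2) ** 2 * k            # = k^3 / 4
    edge = agg = (k // 2) * k            # k/2 per pod, k pods
    core = (k // 2) ** 2
    return {"hosts": hosts, "edge": edge, "agg": agg, "core": core}

print(fat_tree_sizes(4))    # {'hosts': 16, 'edge': 8, 'agg': 8, 'core': 4}
print(fat_tree_sizes(48))   # 27,648 hosts from commodity 48-port switches
```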

Questions for today L2/L3 design: addressing / routing / forwarding in the Fat-Tree L4 design: Transport protocol design (w/ Fat-Tree)

Using multiple paths well

Questions for today L2/L3 design: addressing / routing / forwarding in the Fat-Tree Goals: Routing protocol must expose all available paths Forwarding must spread traffic evenly over all paths

Extend DV / LS? Routing: Distance-Vector: remember all next-hops that advertise equal cost to a destination. Link-State: extend Dijkstra’s to compute all equal-cost shortest paths to each destination. Forwarding: how to spread traffic across next hops?

Forwarding Per-packet load balancing [Diagram: flows from A, B, and C, all destined to D, sprayed packet-by-packet across the available paths.]

Forwarding Per-packet load balancing + traffic well spread (even w/ elephant flows) − interacts badly w/ TCP

TCP w/ per-packet load balancing Consider: the sender sends seq# 1,2,3,4,5 but the receiver receives them as 5,4,3,2,1; the resulting duplicate ACKs make the sender enter fast retx, reduce CWND, retx #1, … repeatedly! Also: one RTT and timeout estimator for multiple paths. Also: CWND is halved when a packet is dropped on any path. Multipath TCP (MP-TCP) is an ongoing effort to extend TCP to coexist with multipath routing; it has value beyond datacenters (e.g., spreading traffic across wifi and 4G access).

Forwarding Per-flow load balancing (ECMP, “Equal Cost Multi Path”) E.g., based on HASH(source-addr, dst-addr, source-port, dst-port)

Forwarding Per-flow load balancing (ECMP, “Equal Cost Multi Path”) E.g., based on HASH(source-addr, dst-addr, source-port, dst-port) Pro: a flow follows a single path (TCP is happy) Con: non-optimal load balancing; elephants are a problem
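A minimal sketch of the hashing idea, assuming a software switch picking among pre-computed equal-cost next hops; real switches use simpler hardware hash functions, and the function and argument names here are illustrative.

```python
# ECMP-style per-flow load balancing: hash the flow's identifying fields and
# use the result to pick one of the equal-cost next hops, so every packet of
# the same flow takes the same path.
import hashlib

def ecmp_next_hop(src_addr, dst_addr, src_port, dst_port, next_hops):
    key = f"{src_addr}|{dst_addr}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha1(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Every packet of this flow maps to the same uplink:
print(ecmp_next_hop("10.0.0.1", "10.3.1.1", 51000, 80, ["port3", "port4"]))
```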

Extend DV / LS? How: simple extensions to DV/LS; ECMP for load balancing. Benefits: + simple; reuses existing solutions. Problem: scales poorly. With N destinations, O(N) routing entries and messages; N is now in the millions!

You: design a scalable routing solution Design for this specific topology Take 5 minutes, then tell me: #routing entries per switch your solution needs many solutions possible; remember: you’re now free from backward compatibility (can redesign IP, routing algorithms, switch designs, etc.)

Solution 1: Topology-aware addressing Give each pod its own prefix: 10.0.*.*, 10.1.*.*, 10.2.*.*, 10.3.*.*

Solution 1: Topology-aware addressing Within each pod, give each edge switch its own prefix: 10.0.0.*, 10.0.1.*, 10.1.0.*, 10.1.1.*, 10.2.0.*, 10.2.1.*, 10.3.0.*, 10.3.1.*

Solution 1: Topology-aware addressing Finally, give each host an address under its edge switch’s prefix: 10.0.0.0, 10.0.0.1, 10.0.1.0, 10.0.1.1, 10.1.0.0, 10.1.0.1, 10.1.1.0, 10.1.1.1, 10.2.0.0, 10.2.0.1, 10.2.1.0, 10.2.1.1, 10.3.0.0, 10.3.0.1, 10.3.1.0, 10.3.1.1

Solution 1: Topology-aware addressing A core switch’s forwarding table needs only one entry per pod: 10.0.*.* → port 1, 10.1.*.* → port 2, 10.2.*.* → port 3, 10.3.*.* → port 4

Solution 1: Topology-aware addressing An aggregation switch in pod 0: 10.0.0.* → port 1, 10.0.1.* → port 2, everything else (*.*.*.*) → ports 3, 4 (the uplinks toward the core)

Solution 1: Topology-aware addressing To pick a single uplink deterministically, match on the host suffix instead: 10.0.0.* → port 1, 10.0.1.* → port 2, *.*.*.0 → port 3, *.*.*.1 → port 4

Solution 1: Topology-aware addressing An edge switch does the same with host addresses: 10.0.0.0 → port 1, 10.0.0.1 → port 2, *.*.*.0 → port 3, *.*.*.1 → port 4

Solution 1: Topology-aware addressing Idea: addresses embed location in the regular topology Maximum #entries/switch: k (= 4 in the example) Constant, independent of #destinations! No route computation / messages / protocols Topology is “hard coded” (but still need localized link failure detection) Problems? VM migration: ideally, a VM keeps its IP address when it moves Vulnerable to misconfiguration (in topology or addresses)
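A rough sketch of the two-level lookup the preceding tables imply, shown for the pod-0 aggregation switch. The table contents mirror the slides; the function and variable names are made up for illustration, and real hardware would do this with TCAM-style prefix/suffix tables rather than Python dictionaries.

```python
# Prefix match first (destinations inside this pod), then a suffix match on the
# last octet that spreads all remaining traffic across the uplinks.
def agg_switch_next_port(dst_ip):
    octets = [int(x) for x in dst_ip.split(".")]
    prefix_table = {(10, 0, 0): 1,    # 10.0.0.*  -> port 1 (edge switch 0)
                    (10, 0, 1): 2}    # 10.0.1.*  -> port 2 (edge switch 1)
    suffix_table = {0: 3, 1: 4}       # *.*.*.0 -> port 3, *.*.*.1 -> port 4 (uplinks)

    port = prefix_table.get(tuple(octets[:3]))
    if port is not None:
        return port                   # destination is inside this pod
    return suffix_table[octets[3]]    # otherwise pick an uplink by host suffix

print(agg_switch_next_port("10.0.1.1"))   # 2: stays inside pod 0
print(agg_switch_next_port("10.3.1.0"))   # 3: goes up toward the core
```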

Solution 2: Centralize + Source Routes A centralized “controller” server knows the topology and computes routes The controller hands each server all paths to each destination O(#destinations) state per server, but server memory is cheap (e.g., 1M routes x 100B/route = 100MB) The server inserts the entire path vector into the packet header (“source routing”) E.g., header = [dst=D | index=0 | path={S5,S1,S2,S9}] A switch forwards based on the packet header: index++; next-hop = path[index]
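A tiny sketch of the forwarding rule just described ("index++; next-hop = path[index]"); the header is modeled as a plain dictionary, which is an illustrative simplification rather than any real packet format.

```python
# Source routing: the sender embeds the whole path; each switch just bumps the
# index and forwards to path[index], with no per-switch routing table at all.
def make_header(dst, path):
    return {"dst": dst, "index": 0, "path": path}

def switch_forward(header):
    header["index"] += 1
    return header["path"][header["index"]]     # the next hop

hdr = make_header("D", ["S5", "S1", "S2", "S9"])
while hdr["index"] < len(hdr["path"]) - 1:
    print("forward to", switch_forward(hdr))   # S1, S2, S9
```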

Solution 2: Centralize + Source Routes #entries per switch? None! #routing messages? Akin to a broadcast from the controller to all servers Pros: switches are very simple and scalable; flexibility: end-points (hence apps) control route selection Cons: scalability / robustness of the controller (SDN addresses this); clean-slate design of everything

Questions for today L2/L3 design: addressing / routing / forwarding in the Fat-Tree L4 design: Transport protocol design (w/ Fat-Tree) Many slides courtesy of Mohammad Alizadeh, Stanford University

Workloads Partition/Aggregate (query) and short messages [50KB-1MB] (coordination, control state): delay-sensitive. Large flows [1MB-50MB] (data update): throughput-sensitive. Slide from Mohammad Alizadeh, Stanford University

Tension Between Requirements: high throughput vs. low latency Deep queues at switches: queuing delays increase latency Shallow queues at switches: bad for bursts & throughput Reducing RTOmin: no good for latency AQM: difficult to tune, not fast enough for incast-style micro-bursts, loses throughput at low statistical multiplexing DCTCP’s objective: low queue occupancy & high throughput Slide from Mohammad Alizadeh, Stanford University

Data Center TCP (DC-TCP) Proposal from Microsoft Research, 2010 Incremental fixes to TCP for DC environments Deployed in Microsoft datacenters (~rumor) Leverages Explicit Congestion Notification (ECN)

Lec#14: Explicit Congestion Notification (ECN) Defined in RFC 3168 using ToS/DSCP bits in the IP header Single bit in packet header; set by congested routers If data packet has bit set, then ACK has ECN bit set Routers typically set ECN bit based on average queue length Congestion semantics exactly like that of drop I.e., sender reacts as though it saw a drop

Review: The TCP/ECN Control Loop ECN = Explicit Congestion Notification [Diagram: senders 1 and 2 share a congested switch; the switch sets a 1-bit ECN mark on packets, and the receiver echoes it back to the senders.] DCTCP is based on the existing Explicit Congestion Notification framework in TCP. Slide from Mohammad Alizadeh, Stanford University

DC-TCP: key ideas React early and quickly: use ECN React in proportion to the extent of congestion, not its presence. Example: ECN marks 1 0 1 1 1 1 0 1 1 1 → TCP cuts the window by 50%, DCTCP by 40%; ECN marks 0 0 0 0 0 0 0 0 0 1 → TCP still cuts by 50%, DCTCP by only 5%. Slide from Mohammad Alizadeh, Stanford University

At the switch If queue length > K (note: instantaneous queue length, not average), set the ECN bit in the packet; otherwise don’t mark. Slide from Mohammad Alizadeh, Stanford University

At the sender Maintain a running average of the fraction of packets marked (α): α ← (1 − g)·α + g·F, where F is the fraction of packets marked in the last window of data. The window adapts based on α: cwnd ← cwnd × (1 − α/2). With α equal to 1 (high congestion), the window is halved (same as TCP!); with α near 0, it is barely reduced. Slide from Mohammad Alizadeh, Stanford University
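Putting the last two slides together, a sketch of the sender-side logic. The update rules are the ones stated above; the class name, the additive-increase step, and the g value are illustrative choices rather than anything mandated by the slides.

```python
# DCTCP sender sketch: the switch marks packets once the instantaneous queue
# exceeds K; the sender tracks a running average alpha of the marked fraction
# and cuts cwnd in proportion to it.
class DctcpSender:
    def __init__(self, cwnd=10.0, g=1.0 / 16):
        self.cwnd = cwnd      # congestion window, in packets
        self.g = g            # weight for the running average
        self.alpha = 0.0      # estimated fraction of marked packets

    def on_window_of_acks(self, acked, marked):
        frac = marked / float(acked)
        # alpha <- (1 - g) * alpha + g * F
        self.alpha = (1 - self.g) * self.alpha + self.g * frac
        if marked > 0:
            # cwnd <- cwnd * (1 - alpha / 2):
            # alpha = 1 halves the window (like TCP); alpha near 0 barely touches it.
            self.cwnd *= (1 - self.alpha / 2)
        else:
            self.cwnd += 1    # usual additive increase when nothing was marked

s = DctcpSender()
s.on_window_of_acks(acked=10, marked=1)   # mild congestion: small alpha, small cut
print(round(s.alpha, 4), round(s.cwnd, 2))
```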

DCTCP in Action [Plot: queue length in KBytes over time.] Setup: Win 7, Broadcom 1Gbps switch (real hardware, not ns2). Scenario: 2 long-lived flows, K = 30KB. Slide from Mohammad Alizadeh, Stanford University

DC-TCP: why it works React early and quickly: use ECN → avoid large buildup in queues → lower latency. React in proportion to the extent of congestion, not its presence → maintain high throughput by not over-reacting to congestion; reduces variance in sending rates, lowering queue buildups. Slide from Mohammad Alizadeh, Stanford University

Data Center TCP (DC-TCP) Proposal from Microsoft Research, 2010 Incremental fixes to TCP for DC environments Deployed in Microsoft datacenters (~rumor) Leverages Explicit Congestion Notification (ECN) An improvement; but still far from “ideal”

What’s “ideal” ? What is the best measure of performance for a data center transport protocol? When the flow is completely transferred? Latency of each packet in the flow? Number of packet drops? Link utilization? Average queue length at switches?

Flow Completion Time (FCT) Time from when flow started at the sender, to when all packets in the flow were received at the receiver

FCT with DCTCP Slide from Mohammad Alizadeh, Stanford University

Recall: “Elephants” and “Mice” Web search, data mining (Microsoft) [Alizadeh 2010]: flows <100KB are 55% of flows but only 3% of bytes; flows >10MB are just 5% of flows but 35% of bytes. Slide from Mohammad Alizadeh, Stanford University

FCT with DCTCP Problem: the mice are delayed by the elephants Slide from Mohammad Alizadeh, Stanford University

Solution: use priorities! [pFabric, Sigcomm 2013] Packets carry a single priority number: priority = remaining flow size (e.g., #bytes un-acked). Switches: very small queues (e.g., 10 packets); send the highest-priority packet, drop the lowest-priority packet. Servers: transmit/retransmit aggressively (at full link rate); drop the transmission rate only under extreme loss (timeouts). Slide from Mohammad Alizadeh, Stanford University
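A rough sketch of the switch-queue behavior described above. pFabric itself is a hardware design; this toy Python version only illustrates "send highest priority, drop lowest priority" with a tiny buffer, and the class and variable names are made up.

```python
import heapq

class PFabricQueue:
    def __init__(self, capacity=10):
        self.capacity = capacity
        self.heap = []          # entries are (priority, arrival_seq, pkt); min-heap
        self.arrivals = 0       # arrival counter, only used to break priority ties

    def enqueue(self, priority, pkt):
        entry = (priority, self.arrivals, pkt)
        self.arrivals += 1
        if len(self.heap) < self.capacity:
            heapq.heappush(self.heap, entry)
            return
        worst = max(self.heap)              # largest remaining size = lowest priority
        if entry < worst:
            self.heap.remove(worst)         # drop the lowest-priority buffered packet
            self.heap.append(entry)
            heapq.heapify(self.heap)
        # else: the arriving packet is itself the lowest priority, so drop it

    def dequeue(self):
        # Always transmit the highest-priority (smallest remaining flow size) packet.
        return heapq.heappop(self.heap)[2] if self.heap else None

q = PFabricQueue(capacity=2)
q.enqueue(1_000_000, "elephant pkt")
q.enqueue(2_000, "mouse pkt")
q.enqueue(500, "tiny-flow pkt")       # buffer full: the elephant packet is dropped
print(q.dequeue())                    # the tiny flow's packet goes out first
```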

FCT Slide from Mohammad Alizadeh, Stanford University

FCT Slide from Mohammad Alizadeh, Stanford University

Why does pFabric work? Consider the problem of scheduling N jobs J1, J2, …, JN with durations T1, T2, …, TN at a single queue/processor. “Shortest Job First” (SJF) scheduling minimizes average job completion time: pick the job with minimum Ti; de-queue and run; repeat. I.e., the job that requires the minimum runtime has the maximum priority. The solution for a network of queues is NP-hard; setting priority = remaining flow size is a heuristic to approximate SJF.

DIY exercise: why does SJF work? Consider 4 jobs with durations 1, 2, 4, 8. Compute finish times w/ Round-Robin vs. SJF. Assume the scheduler: picks the best job (as per the scheduling policy), runs it for one time unit, updates job durations, and repeats. Job completion times: with RR: 1, 5, 10, 15; with SJF: 1, 3, 7, 15.
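A small script that reproduces those numbers under the stated assumptions (a 1-time-unit round-robin quantum); it is only a sanity check, not part of the exercise.

```python
def completion_times_rr(durations):
    remaining = list(durations)
    finish = [0] * len(durations)
    t = 0
    while any(remaining):
        for i in range(len(remaining)):      # one round-robin pass, 1 time unit each
            if remaining[i] > 0:
                t += 1
                remaining[i] -= 1
                if remaining[i] == 0:
                    finish[i] = t
    return finish

def completion_times_sjf(durations):
    finish = {}
    t = 0
    for i in sorted(range(len(durations)), key=lambda j: durations[j]):
        t += durations[i]                    # run the shortest remaining job to completion
        finish[i] = t
    return [finish[i] for i in range(len(durations))]

print(completion_times_rr([1, 2, 4, 8]))   # [1, 5, 10, 15]
print(completion_times_sjf([1, 2, 4, 8]))  # [1, 3, 7, 15]
```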

Summary Learn more: https://code.facebook.com/posts/360346274145943/introducing-data-center-fabric-the-next-generation-facebook-data-center-network/ Next time: Guest lecture by Stephen Strowes, IPv6! -- incast? -- one big switch abstraction…