Authors: Matteo Varvello, Diego Perino, and Leonardo Linguaglossa Publisher: NOMEN 2013 (The 2nd IEEE International Workshop on Emerging Design Choices.

Slides:



Advertisements
Similar presentations
IP Router Architectures. Outline Basic IP Router Functionalities IP Router Architectures.
Advertisements

M AINTAINING L ARGE A ND F AST S TREAMING I NDEXES O N F LASH Aditya Akella, UW-Madison First GENI Measurement Workshop Joint work with Ashok Anand, Steven.
A Scalable and Reconfigurable Search Memory Substrate for High Throughput Packet Processing Sangyeun Cho and Rami Melhem Dept. of Computer Science University.
Bio Michel Hanna M.S. in E.E., Cairo University, Egypt B.S. in E.E., Cairo University at Fayoum, Egypt Currently is a Ph.D. Student in Computer Engineering.
Authors: Alexander Afanasyev, Priya Mahadevany, Ilya Moiseenko, Ersin Uzuny, Lixia Zhang Publisher: IFIP Networking, 2013 (International Federation for.
Fast Buffer Memory with Deterministic Packet Departures Mayank Kabra, Siddhartha Saha, Bill Lin University of California, San Diego.
Exploiting Graphics Processors for High- performance IP Lookup in Software Routers Author: Jin Zhao, Xinya Zhang, Xin Wang, Yangdong Deng, Xiaoming Fu.
Router Architecture : Building high-performance routers Ian Pratt
1 Author: Ioannis Sourdis, Sri Harsha Katamaneni Publisher: IEEE ASAP,2011 Presenter: Jia-Wei Yo Date: 2011/11/16 Longest prefix Match and Updates in Range.
IP Address Lookup for Internet Routers Using Balanced Binary Search with Prefix Vector Author: Hyesook Lim, Hyeong-gee Kim, Changhoon Publisher: IEEE TRANSACTIONS.
What's inside a router? We have yet to consider the switching function of a router - the actual transfer of datagrams from a router's incoming links to.
1 A Tree Based Router Search Engine Architecture With Single Port Memories Author: Baboescu, F.Baboescu, F. Tullsen, D.M. Rosu, G. Singh, S. Tullsen, D.M.Rosu,
Efficient IP-Address Lookup with a Shared Forwarding Table for Multiple Virtual Routers Author: Jing Fu, Jennifer Rexford Publisher: ACM CoNEXT 2008 Presenter:
Sets and Maps Chapter 9. Chapter 9: Sets and Maps2 Chapter Objectives To understand the Java Map and Set interfaces and how to use them To learn about.
1 Lecture 2: Snooping and Directory Protocols Topics: Snooping wrap-up and directory implementations.
Hash Tables1 Part E Hash Tables  
Hash Tables1 Part E Hash Tables  
1 A Fast IP Lookup Scheme for Longest-Matching Prefix Authors: Lih-Chyau Wuu, Shou-Yu Pin Reporter: Chen-Nien Tsai.
1 Performing packet content inspection by longest prefix matching technology Authors: Nen-Fu Huang, Yen-Ming Chu, Yen-Min Wu and Chia- Wen Ho Publisher:
Fast binary and multiway prefix searches for pachet forwarding Author: Yeim-Kuan Chang Publisher: COMPUTER NETWORKS, Volume 51, Issue 3, pp , February.
BUFFALO: Bloom Filter Forwarding Architecture for Large Organizations Minlan Yu Princeton University Joint work with Alex Fabrikant,
Hash, Don’t Cache: Fast Packet Forwarding for Enterprise Edge Routers Minlan Yu Princeton University Joint work with Jennifer.
Router Architectures An overview of router architectures.
OpenFlow Switch Limitations. Background: Current Applications Traffic Engineering application (performance) – Fine grained rules and short time scales.
Computer Networks Switching Professor Hui Zhang
1 Efficient packet classification using TCAMs Authors: Derek Pao, Yiu Keung Li and Peng Zhou Publisher: Computer Networks 2006 Present: Chen-Yu Lin Date:
Paper Review Building a Robust Software-based Router Using Network Processors.
Sarang Dharmapurikar With contributions from : Praveen Krishnamurthy,
ECE 526 – Network Processing Systems Design Network Processor Architecture and Scalability Chapter 13,14: D. E. Comer.
Fast and deterministic hash table lookup using discriminative bloom filters  Author: Kun Huang, Gaogang Xie,  Publisher: 2013 ELSEVIER Journal of Network.
MIDeA :A Multi-Parallel Instrusion Detection Architecture Author: Giorgos Vasiliadis, Michalis Polychronakis,Sotiris Ioannidis Publisher: CCS’11, October.
Scalable Name Lookup in NDN Using Effective Name Component Encoding
Author: Haoyu Song, Fang Hao, Murali Kodialam, T.V. Lakshman Publisher: IEEE INFOCOM 2009 Presenter: Chin-Chung Pan Date: 2009/12/09.
Fast Packet Classification Using Bloom filters Authors: Sarang Dharmapurikar, Haoyu Song, Jonathan Turner, and John Lockwood Publisher: ANCS 2006 Present:
Authors: Haowei Yuan, Tian Song, and Patrick Crowley Publisher: ICCCN 2012 Presenter: Chai-Yi Chu Date: 2013/05/22 1.
Designing Packet Buffers for Internet Routers Friday, October 23, 2015 Nick McKeown Professor of Electrical Engineering and Computer Science, Stanford.
Addressing Queuing Bottlenecks at High Speeds Sailesh Kumar Patrick Crowley Jonathan Turner.
Author: Sriram Ramabhadran, George Varghese Publisher: SIGMETRICS’03 Presenter: Yun-Yan Chang Date: 2010/12/29 1.
Towards a Billion Routing Lookups per Second in Software  Author: Marko Zec, Luigi, Rizzo Miljenko Mikuc  Publisher: SIGCOMM Computer Communication Review,
Review of the literature : DMND:Collecting Data from Mobiles Using Named Data Takashima Daiki Park Lab, Waseda University, Japan 1/15.
Parallelization and Characterization of Pattern Matching using GPUs Author: Giorgos Vasiliadis 、 Michalis Polychronakis 、 Sotiris Ioannidis Publisher:
Pending Interest Table Sizing in Named Data Networking Luca Muscariello Orange Labs Networks / IRT SystemX G. Carofiglio (Cisco), M. Gallo, D. Perino (Bell.
Jennifer Rexford Princeton University MW 11:00am-12:20pm Measurement COS 597E: Software Defined Networking.
Compact Trie Forest: Scalable architecture for IP Lookup on FPGAs Author: O˘guzhan Erdem, Aydin Carus and Hoang Le Publisher: ReConFig 2012 Presenter:
1 Power-Efficient TCAM Partitioning for IP Lookups with Incremental Updates Author: Yeim-Kuan Chang Publisher: ICOIN 2005 Presenter: Po Ting Huang Date:
1. Outline  Introduction  Different Mechanisms Broadcasting Multicasting Forward Pointers Home-based approach Distributed Hash Tables Hierarchical approaches.
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Virtual Memory Hardware.
Author : Sarang Dharmapurikar, John Lockwood Publisher : IEEE Journal on Selected Areas in Communications, 2006 Presenter : Jo-Ning Yu Date : 2010/12/29.
Multilevel Caches Microprocessors are getting faster and including a small high speed cache on the same chip.
Tracking Millions of Flows In High Speed Networks for Application Identification Tian Pan, Xiaoyu Guo, Chenhui Zhang, Junchen Jiang, Hao Wu and Bin Liut.
1 ECE 526 – Network Processing Systems Design System Implementation Principles I Varghese Chapter 3.
Parallel tree search: An algorithmic approach for multi- field packet classification Authors: Derek Pao and Cutson Liu. Publisher: Computer communications.
Block-Based Packet Buffer with Deterministic Packet Departures Hao Wang and Bill Lin University of California, San Diego HSPR 2010, Dallas.
Sets and Maps Chapter 9. Chapter Objectives  To understand the Java Map and Set interfaces and how to use them  To learn about hash coding and its use.
ECE 456 Computer Architecture Lecture #9 – Input/Output Instructor: Dr. Honggang Wang Fall 2013.
Hierarchical packet classification using a Bloom filter and rule-priority tries Source : Computer Communications Authors : A. G. Alagu Priya 、 Hyesook.
Author : Masanori Bando and H. Jonathan Chao Publisher : INFOCOM, 2010 Presenter : Jo-Ning Yu Date : 2011/02/16.
Ofir Luzon Supervisor: Prof. Michael Segal Longest Prefix Match For IP Lookup.
Range Hash for Regular Expression Pre-Filtering Publisher : ANCS’ 10 Author : Masanori Bando, N. Sertac Artan, Rihua Wei, Xiangyi Guo and H. Jonathan Chao.
BUFFALO: Bloom Filter Forwarding Architecture for Large Organizations Minlan Yu Princeton University Joint work with Alex Fabrikant,
1 Lecture 8: Snooping and Directory Protocols Topics: 4/5-state snooping protocols, split-transaction implementation details, directory implementations.
Exploiting Graphics Processors for High-performance IP Lookup in Software Routers Jin Zhao, Xinya Zhang, Xin Wang, Yangdong Deng, Xiaoming Fu IEEE INFOCOM.
Lecture: Large Caches, Virtual Memory
IP Routers – internal view
Addressing: Router Design
Statistical Optimal Hash-based Longest Prefix Match
5.2 FLAT NAMING.
Implementing an OpenFlow Switch on the NetFPGA platform
Packet Classification Using Coarse-Grained Tuple Spaces
CS210- Lecture 16 July 11, 2005 Agenda Maps and Dictionaries Map ADT
Presentation transcript:

Authors: Matteo Varvello, Diego Perino, and Leonardo Linguaglossa Publisher: NOMEN 2013 (The 2nd IEEE International Workshop on Emerging Design Choices in Name- Oriented Networking) Presenter: Chia-Yi Chu Date: 2013/09/11

 Introduction  Name Data Networking  Design Space  Evaluation 2

 Information-Centric Networking (ICN), a novel networking approach where information (or content) replace end-hosts as communication entities.  Named Data Networking (NDN) is one of the most popular ICN designs. 3

 Pending Interest Table (PIT) ◦ keeps track of what content is requested and from which line- card’s interface. ◦ An efficient design of the PIT is thus key to enable NDN (or ICN) at wire speed. ◦ design consists of two aspects:  placement  refers to where in a router the PIT should be implemented  data structure.  how PIT entries should be stored and organized to enable efficient operations. 4

 PIT ◦ keeps track of the interfaces from where chunks have recently been requested and yet not served. ◦ tuple  content_name  list_interfaces  list_nonces  expiration 5

 Three operations can be performed on the PIT ◦ Insert  when a new Interest is received ◦ Update  The new interface is added to list interface ◦ Delete  when an entry in the PIT expires  when a Data is received 6

 Frequent operations ◦ Assume a wire speed of 40 Gbps, Interest packet with a size of 80 bytes and Data packet with a size of 1,500 bytes. ◦ A load equal to 100% where no Data is available in response of the Interests  60 Million operations per second ◦ flow balance or a load of 50%  6 Million operations per second. 7

 Deterministic operation time ◦ Multiple packets are processed in parallel to hide memory access time and increase throughput ◦ non deterministic operation time would require input queues to buffer packets, causing processes to idle while waiting for others to terminate. 8

 Matching algorithm ◦ Assume PIT’s lookup is performed using exact matching and not LPM. 9

 Timer support ◦ PIT’s entries are deleted after a timeout to avoid the PIT’s size to explode over time. 10

 Potential large state ◦ The PIT size can be estimated as λ*T  λ refers to the wire speed.  T is the time each entry lives in the PIT. ◦ Assume that PIT entries would not last for more than 80 ms. ◦ PIT would contain no more than about 250,000 entries even when λ = 40 Gbps. ◦ Worst case: 500 ms. and 1 sec. ◦ PIT can contain between 30 and 60 Million entries. 11

 Distributed design ◦ each line-card deploys its own PIT. ◦ Moving from a central PIT to multiple decentralized PITs  Hard to maintain correct Interest aggregation, loop detection, and multipath support 12

 Input-only ◦ Assume a content router composed by N line-cards interconnected by a switch fabric. ◦ PIT should be placed at each input line-card. ◦ Interest creates a PIT entry only in the PIT of the line-card where it is received. ◦ Data returns at an output line-card, it is broadcasted to all input line-cards where a PIT lookup to check whether the Data should be further forwarded or not 13

 Input-only ◦ enables multipath, but it lacks loop detection and correct Interest aggregation  each PIT is only aware of local list interfaces and list nonces. ◦ N PIT lookups in presence of returning Data. 14

 Output-only ◦ PIT should be placed at each output line-card. ◦ Interest create a PIT entry at the output line-card where it is forwarded. ◦ limitations in case of multipath ◦ loops cannot be detected as each output PIT is only aware of the local list nonces. ◦ requires a FIB lookup per Interest 15

 Input-output ◦ PIT should be placed both in input and output. ◦ Interest creates a PIT entry both at the input line-card where it is received and at the output line-card where it should be forwarded. ◦ no unnecessary FIB lookups and duplicated packets in presence of multipath. ◦ loops cannot be detected. 16

 Third party ◦ PIT should be placed at each input line-card as in the input- only placement. ◦ selected as j = contentID mod N  N is the number of line-cards  contentID is the hash of the content name, or H(A). ◦ as Data is received, the output line-card identifies j by performing H(A) mod N ◦ enables both multipath and loop detection ◦ generates an additional switching operation 17

 Counting Bloom Filter (CBF) ◦ enables deletion using a counter per bit. ◦ only stores a footprint of each PIT’s entry  realizes great compression. ◦ drawback is false positives that generate wasted Data transmissions. ◦ Be coupled with the input-only placement  the compression of its entries loses the information contained in list interfaces. 18

19

 Hash-table ◦ content name is used as key and its corresponding PIT’s entry is used as a value. ◦ Be coupled with all placements. ◦ can detect loop and support timers. ◦ Need larger memory footprint than CBF 20

 Name prefix trie ◦ ordered tree used to store/retrieve values associated to “components”, set of characters separated by a delimiter. ◦ supports LPM, and exact matching as a subset of it. ◦ Encoded Name Prefix Trie (ENPT)  reduces the memory footprint of a name prefix trie by encoding each component to a 32-bits integer called “code”.  drawback is that this compression requires to introduce a hash- table to map codes to components.  does not specify how to detect loops and remove PIT entries with expired timers. 21

 Methods ◦ Assume PIT entry is bits.  1(content name) + 16(expiration) + 16(list_nonces) + 16(list_interfaces) ◦ Linear chaining hash-table (LHT)  32(CRC) + 32(pointer for chaining) = 113 bits. ◦ Open-addressed d-left hash-table (DHT)  d = 2  d sub-tables are accessed sequentially (DHT); parallel (DHTp)  32(CRC) (remove content name, use pointer to a table stores content name ) = 112 bits. ◦ Encoded Name Prefix Trie (ENPT)  Each component is encoded with 32 bits integer. 22

 Memory footprint ◦ To find out which memory should be used for each data structure. ◦ On-chip memory  SRAM(4.25 MB, access time of 1ns) ◦ Off-chip memory 23 TypeSize (MB)Access time (ns) SRAM254 RLDRAM25015 DRAM

24

 # of packets each solution can handle as a function of load. 25

26

 DHT-based ◦ No optimizations  Cavium Octeon Network processor (NP) ◦ Cavium Octeon Plus CN cores 600 MHz network processor equipped with 48KB of L1 cache per core, 2MB of shared memory, 4 GB of off-chip DRAM memory, and an SFP+ 10GbE.  Load = % at 10 Gbps, with a PIT size of 62K and 1M entries.  Interest packets of 80 bytes and Data packets of 1,500 bytes  Short content names (20 bytes) 27

 3 cores, load=50% → 1.5 Mpcks.  8 cores, load = 80%, → 3.4 Mpcks. 28