IXP C Programming Language


IXP C Programming Language. Yaxuan Qi, NSLab, Tsinghua. Dec 29, 2005. (NSLab Confidential)

Overview. A C language implementation for the IXP that provides standard, sequential C semantics, with extensions to support the IXP architecture. It hides the multithreading of the IXP, and the compiler makes full use of these features for high performance. It also hides most of the low-level hardware resources; hardware resources that provide specific acceleration functions are accessed through the intrinsic library.

Packet Processing Stage (PPS). Constructs related to the programming model: a PPS is defined as pps pps_name(void) { }, and the pipe data type is declared as pipe pipe_id;

Packet Processing Stage. A PPS can be partitioned onto one or more microengines (MEs). The partitioning is determined by the compiler, based on a performance specification and code-size considerations. Mapping to MEs can take several forms: multiprocessing, context pipelining, or a combination of the two. The compiler is responsible for all communication and synchronization within a PPS.
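The PPS-and-pipe model above can be approximated in standard C. The sketch below is not IXP C: pipe_sim, pipe_put, pipe_get and processing_pps_step are hypothetical names, the pipe is simulated as a small ring buffer, and the "processing" step is a placeholder. It only illustrates the idea of one stage reading from an input pipe and writing to an output pipe, which in IXP C the compiler would map onto MEs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the IXP C `pipe` type: a small ring buffer. */
#define PIPE_CAP 8u

typedef struct {
    uint32_t slots[PIPE_CAP];
    size_t head;   /* index of the next slot to read */
    size_t count;  /* number of queued items */
} pipe_sim;

/* Returns 1 on success, 0 if the pipe is full. */
int pipe_put(pipe_sim *p, uint32_t v)
{
    if (p->count == PIPE_CAP)
        return 0;
    p->slots[(p->head + p->count) % PIPE_CAP] = v;
    p->count++;
    return 1;
}

/* Returns 1 and stores the item in *out, or 0 if the pipe is empty. */
int pipe_get(pipe_sim *p, uint32_t *out)
{
    if (p->count == 0)
        return 0;
    *out = p->slots[p->head];
    p->head = (p->head + 1) % PIPE_CAP;
    p->count--;
    return 1;
}

/* One iteration of a hypothetical processing stage: take one packet
 * handle from `in`, do placeholder work, and forward it to `out`.
 * Returns 1 if a packet was forwarded, 0 otherwise. */
int processing_pps_step(pipe_sim *in, pipe_sim *out)
{
    uint32_t pkt;
    if (out->count == PIPE_CAP)   /* do not consume if we cannot forward */
        return 0;
    if (!pipe_get(in, &pkt))
        return 0;
    return pipe_put(out, pkt + 1u);  /* placeholder for real packet work */
}
```

In this simulation the stages would be called one after another in a loop; on the IXP the compiler, not the programmer, decides how stages are spread across MEs and threads.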

Sample Application. [Figure: logical view (Packet Rx, process, CSIX Tx, implemented as Rx PPS, Processing PPS, Tx PPS) and physical view (the PPSes mapped onto ME0, ME1, ME2, ..., ME14, ME15, linked by Scratch and NN connections). Mapping labels in the figure: 2-way context pipelining with 8-way multithreading (shown twice), and N-way multiprocessing with 8-way multithreading.]

Critical Path. Path annotation: specified as a statement with the syntax __path(path_id); and used to identify a critical path through the PPS loop body. The programmer marks certain points as belonging to a critical path, and the compiler infers which other points lie on it.

Critical Path. We can limit the cycles for this path.

LOOP Count. Loop count directive: specified as __loopcount(n), where n is an integer constant giving the number of times an inner loop is executed. The compiler uses it to estimate the contribution of a loop to the length of a critical path. Syntax: for __loopcount(n) { }, while __loopcount(n) { }, do __loopcount(n) { }.
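As a rough illustration of how the path annotation and the loop count directive fit together, the sketch below uses stand-in macros so that it builds with a standard C compiler. PATH_MARK and LOOPCOUNT_HINT are hypothetical names standing in for __path(path_id) and __loopcount(n); the placement of the hint after the loop header and the cost model in path_cycles (fixed cycles plus n times the per-iteration cycles) are assumptions, not something the slides specify.

```c
#include <stdint.h>

/* Stand-ins so the sketch compiles as plain C. In IXP C these would be
 * the __path(path_id) statement and the __loopcount(n) directive. */
#define PATH_MARK(id)      ((void)0)
#define LOOPCOUNT_HINT(n)  /* expands to nothing */

#define MAX_HDR_WORDS 4

/* Sums MAX_HDR_WORDS header words. The hint records that the loop body
 * runs MAX_HDR_WORDS times on the annotated critical path. */
uint32_t header_sum(const uint32_t *hdr)
{
    uint32_t sum = 0;
    int i;

    PATH_MARK(1);  /* this point belongs to critical path 1 */
    for (i = 0; i < MAX_HDR_WORDS; i++) LOOPCOUNT_HINT(MAX_HDR_WORDS) {
        sum += hdr[i];
    }
    return sum;
}

/* Assumed cost model: straight-line cycles plus the loop's contribution,
 * taken as loop count times cycles per iteration. */
uint32_t path_cycles(uint32_t fixed, uint32_t n, uint32_t per_iter)
{
    return fixed + n * per_iter;
}
```

Under this assumed model, a path with 20 fixed cycles and a 4-iteration loop costing 5 cycles per iteration would be estimated at 40 cycles, which could then be compared against a cycle limit for that path.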

LOOP Count

Performance Specification

Experiments. [Figure: IXP2xxx packet processing stages. Labels: SPI4, Packet Rx, Ethernet Decap, Range Matching, IPv4 Forwarding, Queue Managing, Packet Tx Scheduling, Packet Tx, CSIX.] Packet processing stages of the packet classification application. The packet classification algorithms run in the Range Matching PPS.

Experiments. Simulation result: Linear Search. Performance evaluation of the linear search algorithm. Each incoming packet matches only the default rule, so the worst-case performance is obtained. Deterministic worst-case bound: O(N).
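The worst case described above can be seen in a minimal linear-search classifier. This is a sketch in standard C, not the code used in the experiments: rule_t, its two fields, and linear_classify are hypothetical. When a packet matches only the last (default) rule, all N rules are examined, which is the O(N) bound.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-field rule with inclusive ranges. */
typedef struct {
    uint32_t src_lo, src_hi;    /* source address range */
    uint16_t port_lo, port_hi;  /* destination port range */
} rule_t;

/* Returns the index of the first matching rule, or n if none matches.
 * *compared receives the number of rules that were examined. */
size_t linear_classify(const rule_t *rules, size_t n,
                       uint32_t src, uint16_t port, size_t *compared)
{
    size_t i;
    for (i = 0; i < n; i++) {
        const rule_t *r = &rules[i];
        if (src >= r->src_lo && src <= r->src_hi &&
            port >= r->port_lo && port <= r->port_hi) {
            *compared = i + 1;
            return i;
        }
    }
    *compared = n;
    return n;
}
```

Placing a catch-all default rule at the end of the table, as in the experiment, guarantees that every packet matches something while also exposing the worst-case cost.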

Experiments. Simulation result: HSM. Performance evaluation of the HSM algorithm. Deterministic worst-case bound: O(log N).
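The slides do not show how HSM reaches its O(log N) bound. The sketch below is not the full HSM algorithm; it only assumes that the bound comes from a binary search over sorted segment end points for a single field, and segment_lookup is a hypothetical name. It shows why such a lookup is deterministic: the number of steps depends only on the number of segments, not on the packet.

```c
#include <stddef.h>
#include <stdint.h>

/* ends[i] is the inclusive upper end of segment i. The array is sorted
 * in ascending order and its last entry must cover the largest key.
 * Requires n >= 1. Returns the index of the segment containing key. */
size_t segment_lookup(const uint32_t *ends, size_t n, uint32_t key)
{
    size_t lo = 0;
    size_t hi = n - 1;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (key <= ends[mid])
            hi = mid;        /* key lies in segment mid or an earlier one */
        else
            lo = mid + 1;    /* key lies in a later segment */
    }
    return lo;
}
```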

Experiments. Simulation result: HiCuts (worst-case path). Performance evaluation of the HiCuts algorithm. The worst-case bound is non-deterministic; 1k rules often need a 10-level decision tree.

Experiments. Simulation result: HiCuts (worst-case path). Moreover, in the worst case it often needs up to 10 linear-search steps after tracing down the decision tree.
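The two cost components mentioned for HiCuts, tree depth and the linear search at the leaf, can be made concrete with a small lookup sketch. It assumes the usual HiCuts structure in which an internal node cuts one dimension into equal-sized intervals and a leaf holds a short rule list. The types hrule_t and hnode_t, the function hicuts_lookup, and the two-dimensional setup are hypothetical and are not taken from the experiments.

```c
#include <stddef.h>
#include <stdint.h>

#define NDIM 2

typedef struct {
    uint32_t lo[NDIM], hi[NDIM];   /* inclusive range per dimension */
} hrule_t;

typedef struct hnode {
    int dim;                           /* dimension cut at this node */
    uint32_t base;                     /* low end of the node's region */
    uint32_t width;                    /* width of each cut, must be > 0 */
    size_t nchild;                     /* 0 marks a leaf */
    const struct hnode *const *child;  /* children of an internal node */
    const hrule_t *rules;              /* leaf: rules searched linearly */
    size_t nrules;
} hnode_t;

/* Walks the tree, then scans the leaf's rules. Assumes the packet lies
 * inside each visited node's region (pkt[dim] >= base). Returns the first
 * matching rule or NULL; *depth and *scanned report the lookup cost. */
const hrule_t *hicuts_lookup(const hnode_t *node, const uint32_t pkt[NDIM],
                             size_t *depth, size_t *scanned)
{
    size_t i;
    int d;

    *depth = 0;
    *scanned = 0;
    while (node->nchild > 0) {
        size_t idx = (pkt[node->dim] - node->base) / node->width;
        if (idx >= node->nchild)
            idx = node->nchild - 1;    /* clamp to the last cut */
        node = node->child[idx];
        (*depth)++;
    }
    for (i = 0; i < node->nrules; i++) {
        const hrule_t *r = &node->rules[i];
        int ok = 1;
        (*scanned)++;
        for (d = 0; d < NDIM; d++) {
            if (pkt[d] < r->lo[d] || pkt[d] > r->hi[d])
                ok = 0;
        }
        if (ok)
            return r;
    }
    return NULL;
}
```

The total cost of a lookup is depth plus scanned, and both depend on how the tree was built for a given rule set, which is why the worst case is not fixed in advance.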

Thanks