A First Example: The Bump in the Wire A First Example: The Bump in the Wire 8/9 - 2006 INF5062: Programming Asymmetric Multi-Core Processors.

Slides:



Advertisements
Similar presentations
Umut Girit  One of the core members of the Internet Protocol Suite, the set of network protocols used for the Internet. With UDP, computer.
Advertisements

IXP: Bump in the Wire IXP: Bump in the Wire INF5063: Programming Asymmetric Multi-Core Processors 15 April 2015.
A First Example: The Bump in the Wire A First Example: The Bump in the Wire 9/ INF5061: Multimedia data communication using network processors.
IXP: The Bump in the Wire IXP: The Bump in the Wire INF5062: Programming Asymmetric Multi-Core Processors 22 April 2015.
Introduction1-1 message segment datagram frame source application transport network link physical HtHt HnHn HlHl M HtHt HnHn M HtHt M M destination application.
Instructor: Sam Nanavaty TCP/IP protocol. Instructor: Sam Nanavaty Version – Allows for the evolution of the protocol IHL (Internet header length) – Length.
1 Chapter 3 TCP and IP. Chapter 3 TCP and IP 2 Introduction Transmission Control Protocol (TCP) Transmission Control Protocol (TCP) User Datagram Protocol.
CP476 Internet Computing TCP/IP 1 Lecture 3. TCP / IP Objective: A in-step look at TCP/IP Purposes and operations Header specifications Implementations.
CSEE W4140 Networking Laboratory Lecture 6: TCP and UDP Jong Yul Kim
CSCI 4550/8556 Computer Networks Comer, Chapter 20: IP Datagrams and Datagram Forwarding.
1 Application TCPUDP IPICMPARPRARP Physical network Application TCP/IP Protocol Suite.
Oct 19, 2004CS573: Network Protocols and Standards1 IP: Datagram and Addressing Network Protocols and Standards Autumn
Chapter 3 Review of Protocols And Packet Formats
ECE 526 – Network Processing Systems Design IXP XScale and Microengines Chapter 18 & 19: D. E. Comer.
Defining Network Protocols Application Protocols –Application Layer –Presentation Layer –Session Layer Transport Protocols –Transport Layer Network Protocols.
Memory INF5062: Programming Asymmetric Multi-Core Processors 1 August 2015.
Gursharan Singh Tatla Transport Layer 16-May
Chapter 4 Queuing, Datagrams, and Addressing
Microsoft Windows Server 2003 TCP/IP Protocols and Services Technical Reference Slide: 1 Lesson 12 Transmission Control Protocol (TCP) Basics.
Petrozavodsk State University, Alex Moschevikin, 2003NET TECHNOLOGIES Internet Control Message Protocol ICMP author -- J. Postel, September The purpose.
© Janice Regan, CMPT 128, CMPT 371 Data Communications and Networking Network Layer ICMP and fragmentation.
TCOM 515 IP Routing Lab Lecture 1. Class information Instructor: Wei Wu –Lecture and Lab session 2 – Instructor:
Introduction to Networks CS587x Lecture 1 Department of Computer Science Iowa State University.
10/13/20151 TCP/IP Transmission Control Protocol Internet Protocol.
The Saigon CTT Semester 1 CHAPTER 10 Le Chi Trung.
Fall 2005Computer Networks20-1 Chapter 20. Network Layer Protocols: ARP, IPv4, ICMPv4, IPv6, and ICMPv ARP 20.2 IP 20.3 ICMP 20.4 IPv6.
1 The Internet and Networked Multimedia. 2 Layering  Internet protocols are designed to work in layers, with each layer building on the facilities provided.
TCOM 515 IP Routing. Syllabus Objectives IP header IP addresses, classes and subnetting Routing tables Routing decisions Directly connected routes Static.
Chapter 4 Network Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012 A note on the use of these.
Hyung-Min Lee ©Networking Lab., 2001 Chapter 8 ARP and RARP.
1 IP : Internet Protocol Computer Network System Sirak Kaewjamnong.
Review the key networking concepts –TCP/IP reference model –Ethernet –Switched Ethernet –IP, ARP –TCP –DNS.
Memory Memory 10/ INF5060: Multimedia data communication using network processors.
Chapter 81 Internet Protocol (IP) Our greatest glory is not in never failing, but in rising up every time we fail. - Ralph Waldo Emerson.
Internetworking Internet: A network among networks, or a network of networks Allows accommodation of multiple network technologies Universal Service Routers.
Internetworking Internet: A network among networks, or a network of networks Allows accommodation of multiple network technologies Universal Service Routers.
1 Network Layer Lecture 16 Imran Ahmed University of Management & Technology.
Layer 3: Internet Protocol.  Content IP Address within the IP Header. IP Address Classes. Subnetting and Creating a Subnet. Network Layer and Path Determination.
Networks and Protocols CE Week 7b. Routing an Overview.
Jan 15, 2008CS573: Network Protocols and Standards1 The Internet Protocol: Related Protocols and Standards (IP datagram, addressing, ARP) Network Protocols.
Internet Protocol Formats. IP (V4) Packet byte 0 byte1 byte 2 byte 3 data... – up to 65 K including heading info Version IHL Serv. Type Total Length Identifcation.
1 Kyung Hee University Chapter 8 ARP(Address Resolution Protocol)
Internet Protocol Version 4 VersionHeader Length Type of Service Total Length IdentificationFragment Offset Time to LiveProtocolHeader Checksum Source.
1 CSE 5346 Spring Network Simulator Project.
1 Figure 3-5: IP Packet Total Length (16 bits) Identification (16 bits) Header Checksum (16 bits) Time to Live (8 bits) Flags Protocol (8 bits) 1=ICMP,
IP Protocol CSE TCP/IP Concepts Connectionless Operation Internetworking involves connectionless operation at the level of the Internet Protocol.
David M. Zar Block Design Review: PlanetLab Line Card Header Format.
Packet Switch Network Server client IP Ether IPTCPData.
© 2003, Cisco Systems, Inc. All rights reserved.
Instructor Materials Chapter 6: Network Layer
Dr. Richard Spillman Fall 2006
Introduction to TCP/IP
Instructor Materials Chapter 5: Ethernet
Chapter 8 ARP(Address Resolution Protocol)
Internet Protocol Formats
Reference Router on NetFPGA 1G
Chapter 6: Network Layer
Internet Protocol (IP)
IP : Internet Protocol Surasak Sanguanpong
Network Core and QoS.
Running A First Example: The Web Bumper
Net 323 D: Networks Protocols
Chapter 15. Internet Protocol
Chapter 4 Network Layer Computer Networking: A Top Down Approach 5th edition. Jim Kurose, Keith Ross Addison-Wesley, April Network Layer.
Internet Protocol Formats
Reference Router on NetFPGA 1G
Network Architecture Models: Layered Communications
Transport Layer 9/22/2019.
Network Core and QoS.
Presentation transcript:

A First Example: The Bump in the Wire A First Example: The Bump in the Wire 8/ INF5062: Programming Asymmetric Multi-Core Processors

Using IXP2400

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Programming Model Packet flow illustration for IP forwarding RX microblock MAC and route lookup TX microblock XScale microengines input ports output ports scheduler queue manager ARP and route table mgr IP packet ARP

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Programming Model RX microblock TX microblock XScale microengines input ports output ports scheduler queue manager Scratch rings head tail metadata queue data buffers linked list logical mapping

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Programming Model RX microblock TX microblock XScale microengines input ports output ports scheduler queue manager Hardware contexts Threads

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Framework uclo  Microengine loader  Necessary to load your microengine code into the microengines at runtime hal  Hardware abstraction layer  Mapping of physical memory into XScale processes’ virtual address space  Functions starting with hal ossl  Operating system service layer  Limited abstraction from hardware specifics  Functions starting with ix_ rm  Resource manager  Layered on top of uclo and ossl  Memory and resource management all memory types and their features IPC, counters, hash  Functions starting with ix_

Bump in the Wire

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Bump in the Wire Internet Count web packets, count ICMP packets

Packet Headers and Encapsulation

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Ethernet describes content of ethernet frame, e.g., 0x0800 indicates an IP datagram 48 bit address configured to an interface on the NIC on the receiver 48 bit address configured to an interface on the NIC on the sender | Destination Address | | | | | Source address | | Frame type |

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Internet Protocol version 4 (IPv4) |Version| IHL |Type of Service| Total Length | | Identification |Flags| Fragment Offset | | Time to Live | Protocol | Header Checksum | | Source Address | | Destination Address | | Options | Padding | indicates the format of the internet header, i.e., version 4 length of the internet header in 32 bit words, and thus points to the beginning of the data (minimum value of 5) indication of the abstract parameters of the quality of service desired – somehow treat high precedence traffic as more important – tradeoff between low-delay, high-reliability, and high-throughput – NOT used, bits now reused datagram length (octets) including header and data - allows the length of 65,535 octets identifying value to aid assembly of fragments first zero, fragments allowed and last fragment indicate where this fragment belongs in datagram checksum on the header only – TCP, UDP over payload. Since some header fields change(TTL), this is recomputed and verified at each point indicates used transport layer protocol disable a packet to circulate forever,decrease value by at least 1 in each node – discarded if 0 32-bit address fields. May be configured differently from small to large networks options may extend the header – indicated by IHL. If the options do not end on a 32-bit boundary, the remaining fields are padded in the padding field (0’s)

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Internet Control Message Protocol (ICMPv4) type-specific arbitrary length data Type of the control msg, including echo request (8) and echo reply (0) Checksum for the ICMP header only | Type | Code | header checksum | | data | Refinement | 8 | 0 | header checksum | | identifier | sequence number | | data | ICMP Echo Request Optional identifier, chosen by sender, echoed by receiver Optional sequence number, chosen by sender, echoed by receiver

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors UDP specifies the total length of the UDP datagram in octets contains a 1’s complement checksum over UDP packet and an IP pseudo header with source and destination address port to identify the sending application port to identify receiving application | Source Port | Destination Port | | Length | Checksum |

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors TCP port to identify the sending application port to identify receiving application contains a 1’s complement checksum over UDP packet and an IP pseudo header with source and destination address sequence number for data in payload acknowledgement for data received options may extend the header. If the options do not end on a 32-bit boundary, the remaining fields are padded in the padding field (0’s) header length in 32 bit units pointer to urgent data in segment code bits: urgent, ack, push, reset, syn, fin | Source Port | Destination Port | | Sequence Number | | Acknowledgment Number | | Header| |U|A|P|R|S|F| | | length| Reserved |R|C|S|S|Y|I| Window | | | |G|K|H|T|N|N| | | Checksum | Urgent Pointer | | Options | Padding | receiver’s buffer size for additional data

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Encapsulation

Lab Setup

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Lab Setup IXP lab switch Internet Student labswitch … Error Reboot switch “make reset” ssh connection

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Lab Setup - Addresses IXP lab switch …

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors IXP lab switch … IXP lab Lab Setup - Addresses … switch … …

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Lab Setup – Data Path IXP lab switch … IXP2400 PCI IO hub memory hub CPU memory hub interface system bus RAM interface web bumper (counting web packets and forwarding all packets from one interface to another) SSH connection to IFI:

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Lab Setup – Data Path IXP lab switch … IXP2400 IO hub memory hub CPU memory web bumper (counting web packets and forwarding all packets from one interface to another) SSH connection to IFI: SSH connection to IFI:

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors IXP 2400 The Web Bumper web bumper (counting web packets and forwarding all packets from one interface to another) web bumper wwbump (core) output port input port XScale microengines rx block processing encompasses all operations performed as packets arrive tx block processing encompasses all operations applied as packets depart rx microblock tx microblock wwbump (microblock) the wwbump core components checks a packet forwarded by the wwbump microblock count ping packet - add 1 to icmp counter send back to wwbump microblock the wwbump microblock checks all packets from rx block: if it is a ping or web packet: if web packet, add 1 to web counter and forward to tx block if ping packet, forward to wwbump core component if neither, forward to tx block The wwbump microblock forwards all packets from the wwbump core component to the tx block

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Starting and Stopping On the host machine  Location of the example: /root/ixa/wwpingbump  Rebooting the IXP card: make reset  Installing the example: make install  Telnet to the card: telnet minicom On the card  To start the example:./wwbump  To stop the example: CTRL-C

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Identifying Web Packets These are the header fields you need for the web bumper: Ethernet type 0x800 IP type 6 TCP port 80

Memory

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Intel IXP2400 Hardware Reference Manual

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors 4 GB address space (32 bit pointers) XScale Memory Mapping SDRAM (up to 2GB) Mapped at boot time: Flash ROM SRAM (up to 1GB) Other (up to ½GB) PCI Memory (up to ½GB) 0x x xC xE xFFFF FFFF 0xDFFF FFFF 0xBFFF FFFF 0x7FFF FFFF XScale Local CSRs (32MB) SRAM CSRs & Queue Array (64MB) Scratch (32MB) DRAM CSRs reserved PCI controller CSRs PCI config registers PCI Spec/IACK PCI CFG PCI I/O MSF Flash ROM reserved CAP-CSRs Available on our cards: 256 MB SDRAM 64 MB SRAM 16 KB Scratch

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors 4 GB address space (32 bit pointers) XScale Memory Mapping SDRAM (up to 2GB) Mapped at boot time: Flash ROM SRAM (up to 1GB) Other (up to ½GB) PCI Memory (up to ½GB) 0x x xC xE xFFFF FFFF 0xDFFF FFFF 0xBFFF FFFF 0x7FFF FFFF Read, Write, Swap 0x Bit Set, Bit Test & Set 0x Bit Clear, Bit Test & Clear 0x Add, Test and Add 0x8C reserved Read, Write, Swap 0x Bit Set, Bit Test & Set 0x Bit Clear, Bit Test & Clear 0x Add, Test and Add 0x9C Get, Put 0xCE xCE Get, Put Deq, Enq 0xCC xCC Deq, Enq SRAM CSR 0xCC xCC SRAM CSR Channel 0 same memory mapped four times implicit features no more than 128 kB per channel Channel 1

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Microengine Memory Mapping SDRAM (up to 2GB) Mapped at boot time: Flash ROM Other (up to ½GB) PCI Memory (up to ½GB) 0x x xC xE xFFFF FFFF 0xDFFF FFFF 0xBFFF FFFF 0x7FFF FFFF XScale Local CSRs (32MB) SRAM CSRs & Queue Array (64MB) Scratch (32MB) DRAM CSRs reserved PCI controller CSRs PCI config registers PCI Spec/IACK PCI CFG PCI I/O MSF Flash ROM reserved CAP-CSRs Read, Write, Swap 0x Bit Set, Bit Test & Set 0x Bit Clear, Bit Test & Clear 0x Add, Test and Add 0x8C Read, Write, Swap 0x Bit Set, Bit Test & Set 0x Bit Clear, Bit Test & Clear 0x Add, Test and Add 0x9C reserved SDRAM 0x Scratch SRAM Channel 0 SRAM Channel 1 0x

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors XScale Memory A general purpose processor  With MMU In use!  32 Kbytes instruction cache Round robin replacement  32 Kbytes data cache Round robin replacement Write-back cache, cache replacement on read, not on write  2 Kbytes mini-cache for data that is used once and then discarded To reduce flushing of the main data cache  Instruction code stored in SDRAM

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Microengine Memory 256 general purpose registers  Arranged in two banks 512 transfer registers  Transfer registers are not general purpose registers  DRAM transfer registers Transfer in Transfer out  SRAM transfer registers Transfer in Transfer out  Push and pull on transfer registers usually by external units 128 next neighbor registers  New in ME V2  Dedicated data path to neighboring ME  Also usable inside a ME  SDK use: message forwarding using rings 2560 bytes local memory  New in ME V2  RAM  Quad-aligned  Shared by all contexts  SDK use: register spill in code generated from MicroC

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors SDRAM Recommended use  XScale instruction code  Large data structures  Packets during processing 64-bit addressed (8 byte aligned, quadword aligned) Up to 2GB  Our cards have 256 MB  Unused higher addresses map onto lower addresses! 2.4 Gbps peak bandwidth  Higher bandwidth than SRAM  Higher latency than SRAM Access  Instruction from external devices are queued and scheduled  Accessed by XScale Microengines PCI

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors SRAM Recommended use  Lookup tables  Free buffer lists  Data buffer queue lists 32-bit addressed (4 byte aligned, word aligned) Up to 16 MB  Distributed over 4 channels  Our cards have 8 MB, use 2 channels 1.6 Gbps peak bandwidth  Lower bandwidth than SDRAM  Lower latency than SDRAM Access  XScale  Microengines Accessing SRAM  XScale access Byte, word and longword access  Microengine access Bit and longword access only

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors SRAM Special Features Atomic bit set and clear with/without test Atomic increment/decrement Atomic add and swap Atomic enqueue, enqueue_tail, dequeue  Hardware support for maintaining queues  Combination enqueue/enqueue_tail allows merging of queues  Several modes Queue mode: data structures at discontiguous addresses Ring mode: data structures in a fixed-size array Journaling mode: keep previous values in a fixed-size array

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Scratch Memory Recommended use  Passing messages between processors and between threads  Semaphores, mailboxes, other IPC 32-bit addressed (4 byte aligned, word aligned) 4 Kbytes Has an atomic autoincrement instruction  Only usable by microengines

2006 Carsten Griwodz & Pål HalvorsenINF5062 – programming asymmetric multi-core processors Scratchpad Special Features Atomic bit set and clear with/without test Atomic increment/decrement Atomic add and swap Atomic get/put for rings  Hardware support for rings links SRAM  Signaling when ring is full