Digital Video Cluster Simulation
Martin Milkovits
CS699 – Professional Seminar
April 26, 2005

Goal of Simulation
- Build an accurate performance model of the interconnecting fabrics in a Digital Video cluster
- Assumptions
  - RAID controller follows a triangular distribution of I/O interarrival times
  - Gigabit Ethernet IP edge card does not impose any backpressure on the I/Os
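The triangular interarrival assumption can be illustrated with a minimal sketch. The slides do not give the distribution's parameters, so the `low`, `mode`, and `high` values below are purely hypothetical, as is the helper name `interarrival_times`.

```python
import random

def interarrival_times(n, low, mode, high, seed=0):
    """Draw n I/O interarrival times (in seconds) from a triangular distribution."""
    rng = random.Random(seed)
    # Note the argument order of random.triangular: (low, high, mode).
    return [rng.triangular(low, high, mode) for _ in range(n)]

# Hypothetical parameters (not from the slides): 6 ms to 12 ms, most often 8 ms.
samples = interarrival_times(10_000, low=0.006, mode=0.008, high=0.012)
mean = sum(samples) / len(samples)
# The theoretical mean of a triangular distribution is (low + mode + high) / 3.
print(f"sample mean = {mean * 1e3:.3f} ms")
```

A triangular distribution needs only a minimum, a most likely value, and a maximum, which makes it convenient when only rough measurements of the real device are available.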

Fabrics Simulated

Table 1: Connection Technologies

| Fabric | Type | Per Link/Bus Actual Bandwidth (Gbps) | Hardware Device | Buffers | Ports |
|---|---|---|---|---|---|
| PCI MHz/ 64bit | Parallel – bridged | 3.934 | n/a | n/a | n/a |
| StarFabric (SF) | Full Duplex Serial | 1.77 | StarGen 2010 Bridge | Per SF Port and PCI | 2 – StarFabric, 1 – 64/66 PCI |
| | | | StarGen 1010 Switch | Per SF Port | 6 – StarFabric |
| InfiniBand (IB) | Full Duplex Serial | 2.0 | Mellanox Bridge / Switch | Per IB port and PCI | 8 – 1X InfiniBand, 1 – 64/66 PCI |

Digital Video Cluster

Digital Video Node

Modules, Connections and Messages
- Messages represent data packets AND are used to control the model
  - For data packets – have a non-zero length parameter
  - Contain routing and source information
- Modules handle message processing and routing
  - By and large represent hardware in the system
  - PCI Bus module – not actual hardware, but necessary to simulate a bus architecture
- Connections allow messages to flow between modules
  - Represent links/busses
  - Independent connections for data vs. control messages
  - May be configured with a data rate value to simulate transmission delay
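As a rough illustration of how a message's length and a connection's data rate combine into a transmission delay, here is a minimal sketch. The class and field names (`Message`, `Connection`, `length_bits`, `data_rate_bps`) are assumptions made for this example and are not taken from the simulation itself.

```python
from dataclasses import dataclass

@dataclass
class Message:
    kind: str             # "data" or "control" (illustrative labels)
    src: str              # source information
    dest: str             # routing information
    length_bits: int = 0  # non-zero only for data packets

@dataclass
class Connection:
    data_rate_bps: float = 0.0  # 0 means no transmission delay is modelled

    def transmission_delay(self, msg: Message) -> float:
        """Seconds needed to serialize msg onto this connection."""
        if self.data_rate_bps <= 0 or msg.length_bits == 0:
            return 0.0
        return msg.length_bits / self.data_rate_bps

# Data link at 2.0 Gbps (the InfiniBand figure from Table 1);
# the 8192-bit payload is a made-up example size.
data_link = Connection(data_rate_bps=2.0e9)
packet = Message("data", src="raid0", dest="ib0", length_bits=8192)
control = Message("control", src="raid0", dest="ib0")

delay = data_link.transmission_delay(packet)   # 8192 / 2.0e9 seconds
print(f"data delay = {delay * 1e6:.3f} us")
print(f"control delay = {data_link.transmission_delay(control)} s")
```

Keeping control messages at zero length means they cost no simulated link time, which matches the idea of separate connections for data and control traffic.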

Managing Buffer/Bus access
- Before transferring a data message (RWM)
  - Need to gain access to transfer link/bus and destination buffer
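The two-resource check described here can be sketched as follows. This is an assumed design for illustration only: the names `Link`, `DestBuffer`, `request_transfer`, and the `"grant"`/`"retry"` replies are hypothetical and not taken from the slides.

```python
class DestBuffer:
    """Destination buffer with a fixed number of message slots."""
    def __init__(self, slots):
        self.free = slots

    def reserve(self):
        if self.free == 0:
            return False
        self.free -= 1
        return True

    def release(self):
        self.free += 1

class Link:
    """A link or bus that carries one transfer at a time."""
    def __init__(self):
        self.busy = False

def request_transfer(link, buf):
    """Reply 'grant' only if both the link and a buffer slot were acquired."""
    if link.busy:
        return "retry"
    if not buf.reserve():
        return "retry"
    link.busy = True
    return "grant"

def complete_transfer(link):
    """Free the link once the data message has been delivered."""
    link.busy = False

link, buf = Link(), DestBuffer(slots=1)
replies = [request_transfer(link, buf)]      # both resources free
replies.append(request_transfer(link, buf))  # link is busy
complete_transfer(link)
replies.append(request_transfer(link, buf))  # buffer slot still occupied
buf.release()
replies.append(request_transfer(link, buf))  # both free again
print(replies)
```

Checking the link before reserving the buffer slot avoids holding a slot for a transfer that cannot start yet.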

PCI Bus Challenges
- Maintain bus fairness
- Allow multiple PCI bus masters to interleave transactions (account for retry overhead)
- Allow bursting if only one master
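One simple way to meet both the fairness and the bursting goals is round-robin arbitration. The slides do not say which arbitration policy the model uses, so the sketch below is an assumption, and `RoundRobinArbiter` is a hypothetical name.

```python
class RoundRobinArbiter:
    """Grant the bus to requesting masters in rotating order."""
    def __init__(self, max_devices):
        self.max_devices = max_devices
        self.last = max_devices - 1  # so that device 0 is checked first

    def grant(self, requesting):
        """Return the next requesting device after the last one granted, or None."""
        for offset in range(1, self.max_devices + 1):
            dev = (self.last + offset) % self.max_devices
            if dev in requesting:
                self.last = dev
                return dev
        return None

# Two masters requesting at once are interleaved.
arb = RoundRobinArbiter(max_devices=4)
interleaved = [arb.grant({1, 3}) for _ in range(4)]

# A single master keeps winning the bus, which allows it to burst.
solo = RoundRobinArbiter(max_devices=4)
burst = [solo.grant({2}) for _ in range(3)]
print(interleaved, burst)
```

Because the search starts just after the last granted device and wraps around to include it, a lone master is re-granted immediately while competing masters alternate.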

PCI Bus Module Components
- Queue – pending RWMs
- pciBus[maxDevices] array – utilization key
- reqArray[maxDevices] – pending rqst messages
- Work area – manages the RWM actually being transferred by the PCI bus
- 3 message types to handle
  - rqst messages from PCI bus masters
  - RWM messages
  - qCheck self-messages

Handling rqst and RWM messages
- When an RWM finally hits the work area
  - Set RWM.transfer value = length of message (1024)
  - Schedule qCheck self-message to fire in 240 ns (time to transfer 128 bits)
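The qCheck interval can be sanity-checked with a little bus arithmetic. The sketch assumes a 64-bit, 66 MHz PCI bus (read from the "64/66 PCI" entries in Table 1) and ignores arbitration and retry overhead; the constant and function names are hypothetical.

```python
BUS_WIDTH_BITS = 64   # assumed from "64/66 PCI" in Table 1
BUS_CLOCK_HZ = 66e6   # assumed from "64/66 PCI" in Table 1

def transfer_time_ns(bits):
    """Ideal time to move `bits` across the bus, one bus word per clock."""
    clocks = -(-bits // BUS_WIDTH_BITS)  # ceiling division
    return clocks * 1e9 / BUS_CLOCK_HZ

t_128_bits = transfer_time_ns(128)       # 2 clocks
t_128_bytes = transfer_time_ns(128 * 8)  # 16 clocks
print(f"128 bits:  {t_128_bits:.1f} ns")
print(f"128 bytes: {t_128_bytes:.1f} ns")
```

Under these assumptions, 240 ns is about 16 bus clocks, or roughly 128 bytes of data, so the slide's figure may refer to 128 bytes rather than 128 bits. This is an inference from the arithmetic, not something the slides state.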

Handling qCheck Messages

Determining Max Bandwidth

Simulation Ramp-up

120MBps Results
Table columns: Node, RAID, Sample Mean, 90% Confidence Interval
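For readers unfamiliar with the statistic, a 90% confidence interval around a sample mean can be computed as sketched below. The throughput values are invented for illustration and are not results from the simulation. The sketch uses a normal approximation; with only a handful of runs, a Student t quantile would give a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval_90(samples):
    """Two-sided 90% confidence interval for the mean (normal approximation)."""
    m = mean(samples)
    z = NormalDist().inv_cdf(0.95)  # leaves 5% in each tail
    half_width = z * stdev(samples) / sqrt(len(samples))
    return m - half_width, m + half_width

# Hypothetical per-run throughput in MBps (made-up values, not from the slides).
runs = [118.2, 119.5, 120.1, 119.0, 120.4, 118.8]
low, high = confidence_interval_90(runs)
print(f"mean = {mean(runs):.2f} MBps, 90% CI = ({low:.2f}, {high:.2f})")
```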

Contention / Utilization / Capacity

Learning Experiences
- PCI contention
  - First modeled as a link like any other, maintained by the StarGen chip
- Buffer contention and access
  - Originally used retry loops, like the actual system: way too much processing time!
  - Retry messages that are returned are a natural design given the language of messages and connections

Conclusion / Future Work
- Simulation performed within 7% of actual system performance
- PCI bus between IB and StarGen is a potential hotspot
- Complete more iterations with minor system modifications (dualDMA, scheduling)
- Submitted paper to the Winter Simulation Conference