Fault-Tolerant Delay-Insensitive Inter-Chip Communication
Yebin Shi, APT Group, The University of Manchester



Outline
– SpiNNaker inter-chip interconnect
– Basic transmitter and receiver
– Potential problems with the designs
– Robust transmitter and receiver
– Future work and conclusion

Research Aims
– Investigate how transient glitches on the inter-chip wires affect the link interface circuits.
– Redesign the link interface circuits to increase glitch resistance and avoid deadlock.

SpiNNaker
Network infrastructure:
– 6 bidirectional inter-chip links
– Delay-insensitive on-chip and inter-chip communication
– Variable-length packets, serialized into 4-bit flits, with an end-of-packet marker
– 1 Gb/s throughput per link
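The flit serialization described above can be sketched as follows. This is only an illustration: the nibble order (low nibble first) and the `EOP` placeholder are assumptions, not details given in the slides.

```python
EOP = "EOP"  # hypothetical stand-in for the end-of-packet marker


def to_flits(payload: bytes) -> list:
    """Split each byte into two 4-bit flits and append the EoP marker.

    Low nibble first is an assumed order, chosen only for the example.
    """
    flits = []
    for byte in payload:
        flits.append(byte & 0xF)   # low 4 bits
        flits.append(byte >> 4)    # high 4 bits
    flits.append(EOP)
    return flits


def from_flits(flits: list) -> bytes:
    """Rebuild the payload from the flits that precede the EoP marker."""
    data = flits[:flits.index(EOP)]
    return bytes(lo | (hi << 4) for lo, hi in zip(data[0::2], data[1::2]))
```

Because the packet length is variable, the receiver in this sketch finds the end of the packet from the marker alone rather than from a length field.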

Inter-Chip Communication
Inter-chip network:
– 2-of-7 data encoding
– 2-phase (NRZ) handshake
– Data and control in a single stream
On-chip network:
– 3-of-6 data encoding
– 4-phase (RTZ) handshake
– Separate data and control channels
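A short sketch of why these two codes fit 4-bit flits: a 2-of-7 code has 21 codewords, enough for 16 data values plus an EoP symbol in one stream, and a 3-of-6 code has 20, enough for the 16 data values. The assignment of values to codewords below is hypothetical; the slides do not give the real mapping.

```python
from itertools import combinations


def m_of_n_codewords(m, n):
    """All ways to pick m active wires out of n."""
    return [frozenset(c) for c in combinations(range(n), m)]


CODES_2OF7 = m_of_n_codewords(2, 7)   # inter-chip link: 21 codewords
CODES_3OF6 = m_of_n_codewords(3, 6)   # on-chip network: 20 codewords

# Hypothetical assignment: the first 16 codewords carry the 4-bit flit
# values and the 17th is used as the end-of-packet symbol.
ENCODE_2OF7 = {value: CODES_2OF7[value] for value in range(16)}
ENCODE_2OF7["EOP"] = CODES_2OF7[16]


def send_2phase(levels, codeword):
    """2-phase (NRZ): toggle the selected wires and leave them at the new level."""
    return [lvl ^ 1 if i in codeword else lvl for i, lvl in enumerate(levels)]
```

In the 2-phase scheme a symbol is a pair of transitions, not a pair of high levels, so sending the same codeword twice returns the wires to their starting levels.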

Link Transmitter
– Data channel: pipeline for code and phase conversion
– Ctrl channel: merges the EoP symbol into the data stream

Link Receiver
– Data channel: pipeline for phase and code conversion
– Ctrl channel: extracts EoP symbols from the stream

Glitch Impact on Simulation
– Automatic packet data generation
– CRC scheme included for result verification
– Random generation of transient glitches
  – injected onto the inter-chip link
  – Single Event Upset (SEU) scenarios
– Configurable frequency and duration of glitches
  – frequency: up to ½ glitch/packet
  – duration scale: ns
– Extensive simulation
  – a large number of densely packed glitches over 1M packets
  – speeds up fault simulation
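A glitch generator of this kind might look like the sketch below. All names and defaults are hypothetical: the wire count (7 data wires plus 1 acknowledge wire), the duration range, and the fixed seed are assumptions made for the example, not parameters from the original test bench.

```python
import random


def glitch_schedule(num_packets, glitches_per_packet=0.5,
                    num_wires=8, max_duration_ns=5.0, seed=0):
    """Return a list of random glitch events, at most one per packet.

    glitches_per_packet is the probability that a packet is hit, so 0.5
    corresponds to the "up to 1/2 glitch/packet" setting.
    """
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    events = []
    for packet in range(num_packets):
        if rng.random() < glitches_per_packet:
            events.append({
                "packet": packet,
                "wire": rng.randrange(num_wires),          # assumed 7 data + 1 ack
                "duration_ns": rng.uniform(0.1, max_duration_ns),
            })
    return events
```

Each event would then be applied to the simulated link while the corresponding packet is in flight, with the CRC check deciding whether the packet arrived intact.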

Fault Effects in the Transmitter
Deadlock risks:
– A transient glitch may corrupt a 2-of-7 symbol, leading to handshake failure.
– The phase converter is phase-sensitive.
– The chips can be reset independently.

Fault Effects in the Receiver
Deadlock risks:
– A corrupted 2-of-7 symbol may prevent completion of the conversion to 3-of-6.
– The chips can be reset independently.
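The receiver-side risk can be illustrated with a toy model. It assumes that a glitch appears to the receiver as one extra wire event and that a naive completion detector only fires when the set of wires that fired is exactly a valid 2-of-7 codeword. Both are simplifying assumptions, not a description of the actual circuit.

```python
from itertools import combinations

VALID_2OF7 = {frozenset(c) for c in combinations(range(7), 2)}


def naive_receive(fired_wires):
    """Return the codeword if exactly two wires fired, else None.

    None means the conversion never completes, so no acknowledge is
    sent and the transmitter waits forever.
    """
    fired = frozenset(fired_wires)
    return fired if fired in VALID_2OF7 else None
```

With this model, `naive_receive([1, 4])` completes, while `naive_receive([3, 1, 4])` (a glitch on wire 3 before the real symbol) returns `None`, which is the deadlock case.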

Deadlock in Receiver
– A glitch occurs while dout_cd is in transit.
– A wrong value is stored in the bottom latch.
– The next data conversion fails.

Robust 2-ph to 4-ph Conversion
Phase-insensitive converter:
– Used on the 2-phase ack input to the Transmitter.
– Used on the 2-phase data inputs to the Receiver.
(Reset signal not shown.)
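The difference between the two converter styles can be sketched behaviourally. The model below is an assumption about what "phase-sensitive" means here: the converter remembers which level it expects next and ignores anything else, whereas the phase-insensitive one reacts to any change of level. The class and method names are hypothetical.

```python
class PhaseSensitiveConverter:
    """Assumed behaviour: only reacts when the wire reaches the expected level."""

    def __init__(self):
        self.expected = 1  # after reset it waits for a rising edge

    def on_level(self, level):
        if level == self.expected:
            self.expected ^= 1
            return True    # emit one 4-phase request
        return False       # event missed: the handshake stalls


class PhaseInsensitiveConverter:
    """Reacts to any transition, whatever its direction."""

    def __init__(self, initial_level=0):
        self.last = initial_level  # wire level sampled at reset

    def on_level(self, level):
        fired = level != self.last
        self.last = level
        return fired
```

If the wire happens to be high when the converter comes out of reset, the next real event is a falling edge: the phase-sensitive model ignores it, while the phase-insensitive model still produces a request.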

Robust Receiver Design
– Phase-insensitive phase converter
– Enhanced code converter and completion detector
– Independent reset capability

Receiver Phase Converter
– acki also triggers the ack signal back to the transmitter.

Code Conversion with Priority Arbitration
– Supports the full set of 2-of-7 codes
– Converts invalid symbols into a valid one
– Stops propagation of invalid symbols containing more than 2 transitions
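One way to picture the arbitration step is the sketch below. It assumes a fixed priority order in which the lowest-numbered wire wins; the slides do not state the real priority scheme, so treat the ordering and the function name as hypothetical.

```python
def arbitrate(fired_wires, priority=range(7)):
    """Reduce any set of fired wires to a valid 2-of-7 codeword.

    Keeps the two highest-priority wires that fired and drops the rest,
    so a symbol with more than two transitions cannot propagate.
    Returns None while fewer than two wires have fired.
    """
    fired = set(fired_wires)
    winners = [wire for wire in priority if wire in fired][:2]
    return frozenset(winners) if len(winners) == 2 else None
```

For example, `arbitrate([3, 1, 4])` yields the codeword for wires 1 and 3. The result may not be the symbol that was sent, so the data can still be wrong, but the conversion completes and an acknowledge can be returned instead of the link deadlocking.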

Independent Reset
– An extra, possibly redundant, transition is created after reset in case the Tx is waiting for an acknowledge token.
– The phase-insensitive converter for ack2 in the Tx absorbs the extra token if it is not needed.
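The reset rule can be sketched as a small token model. This is a behavioural illustration with hypothetical names, not the circuit: the transmitter either consumes an acknowledge it was waiting for or harmlessly absorbs one it was not.

```python
class Transmitter:
    def __init__(self):
        self.waiting_for_ack = False

    def send_symbol(self):
        """Send one symbol; refuse while the previous one is unacknowledged."""
        if self.waiting_for_ack:
            return False
        self.waiting_for_ack = True
        return True

    def on_ack_transition(self):
        """Consume the ack if one is pending, otherwise absorb it."""
        if self.waiting_for_ack:
            self.waiting_for_ack = False
            return "consumed"
        return "absorbed"


def receiver_reset(tx):
    """After its own reset the receiver emits one extra ack transition."""
    return tx.on_ack_transition()
```

If the receiver resets while a symbol is in flight, the extra transition releases the waiting transmitter. If nothing was in flight, the same transition is absorbed and has no effect.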

Simulation Results
Simulation results for 1 million packets sent:

Item                            Original I/F   Proposed I/F
Glitches                        478,280        390,357
Successfully received packets   916,684        863,182
Deadlocks                       7,632          7
Performance (ns/symbol)         17             15
Area (µm²)

– Significantly reduced deadlock occurrence.
– Worse packet loss.
– Trivial area overhead.
– Increased throughput.
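The headline ratios can be worked out from these figures. The sketch reads the flattened numbers as 7,632 and 7 deadlocks and as 17 and 15 ns/symbol for the original and proposed interfaces; that reading is an assumption, chosen because it matches the stated conclusions.

```python
PACKETS_SENT = 1_000_000

# Figures as read from the results slide (see the assumption above).
original = {"received": 916_684, "deadlocks": 7_632, "ns_per_symbol": 17}
proposed = {"received": 863_182, "deadlocks": 7, "ns_per_symbol": 15}


def summarize(design):
    """Derive delivery ratio, deadlock rate and symbol rate for one design."""
    return {
        "delivery_ratio": design["received"] / PACKETS_SENT,
        "deadlocks_per_million": design["deadlocks"] * 1_000_000 / PACKETS_SENT,
        "symbols_per_us": 1000 / design["ns_per_symbol"],
    }


before, after = summarize(original), summarize(proposed)
deadlock_reduction = original["deadlocks"] / proposed["deadlocks"]
symbol_rate_gain = after["symbols_per_us"] / before["symbols_per_us"] - 1
```

Under that reading, deadlocks drop by roughly three orders of magnitude, the delivery ratio falls from about 91.7% to about 86.3%, and the symbol rate rises by about 13%.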

Conclusions and Future Work
– Enhance the resistance to transient glitches in inter-chip links by replacing phase converters.
– Avoid deadlocks by hardening completion detection modules in the receiver.
– Remove corrupt symbols by applying an arbitration scheme for symbol conversions.
– Allow independent chip resets without introducing deadlocks by sending safe, possibly redundant tokens (data or ack) on reset.
– A generalized approach for circuit evaluation, including the computation of safety margins.
– Investigation into the impact of back-pressure on glitch resistance.