MROD IDR 12 Nov. 2003 / JV 1 1.MROD-1 software 2.Testing 3.TigerSHARC in stead of the 21160?

Slides:



Advertisements
Similar presentations
Computer Architecture
Advertisements

System Integration and Performance
Accessing I/O Devices Processor Memory BUS I/O Device 1 I/O Device 2.
Computer Architecture
FPGA Configuration. Introduction What is configuration? – Process for loading data into the FPGA Configuration Data Source Configuration Data Source FPGA.
1/1/ / faculty of Electrical Engineering eindhoven university of technology Architectures of Digital Information Systems Part 1: Interrupts and DMA dr.ir.
I/O Unit.
Peter JansweijerMROD Design Review: November 12, 2003Slide 1 MROD-1 Hardware Overview MRODin MRODout MROD-1 = 3 x MRODin + 1 x MRODout.
FIU Chapter 7: Input/Output Jerome Crooks Panyawat Chiamprasert
Interfacing Processors and Peripherals Andreas Klappenecker CPSC321 Computer Architecture.
I/O Hardware n Incredible variety of I/O devices n Common concepts: – Port – connection point to the computer – Bus (daisy chain or shared direct access)
1 Computer System Overview OS-1 Course AA
04/16/2010CSCI 315 Operating Systems Design1 I/O Systems Notice: The slides for this lecture have been largely based on those accompanying an earlier edition.
22 February 2001ATLAS MDT Electronics PDR1 The MROD The MDT Precision Chambers ROD Adriaan König University of Nijmegen.
1 CSIT431 Introduction to Operating Systems Welcome to CSIT431 Introduction to Operating Systems In this course we learn about the design and structure.
26 February 2002ATLAS Muon Electronics Meeting1 MROD: the MDT Read Out Driver Status of MROD-1 Prototype Adriaan König University of Nijmegen.
5 October 20002nd ATLAS ROD Workshop1 The MROD The MDT Precision Chambers ROD Adriaan König University of Nijmegen.
Device Management.
11 November 2003ATLAS MROD Design Review1 The MROD The Read Out Driver for the ATLAS MDT Muon Precision Chambers Marcello Barisonzi, Henk Boterenbrood,
Timers and Interrupts Shivendu Bhushan Summer Camp ‘13.
Chapter 13: I/O Systems I/O Hardware Application I/O Interface
INPUT/OUTPUT ARCHITECTURE By Truc Truong. Input Devices Keyboard Keyboard Mouse Mouse Scanner Scanner CD-Rom CD-Rom Game Controller Game Controller.
Input/Output. Input/Output Problems Wide variety of peripherals —Delivering different amounts of data —At different speeds —In different formats All slower.
Data and Computer Communications Chapter 8 – Multiplexing
Khaled A. Al-Utaibi  Intel Peripheral Controller Chips  Basic Description of the 8255  Pin Configuration of the 8255  Block Diagram.
Chapter 10: Input / Output Devices Dr Mohamed Menacer Taibah University
1 Computer System Overview Chapter 1. 2 n An Operating System makes the computing power available to users by controlling the hardware n Let us review.
Computer System Overview Chapter 1. Operating System Exploits the hardware resources of one or more processors Provides a set of services to system users.
MICROPROCESSOR INPUT/OUTPUT
FPGA IRRADIATION and TESTING PLANS (Update) Ray Mountain, Marina Artuso, Bin Gui Syracuse University OUTLINE: 1.Core 2.Peripheral 3.Testing Procedures.
1-1 Embedded Network Interface (ENI) API Concepts Shared RAM vs. FIFO modes ENI API’s.
2007 Oct 18SYSC2001* - Dept. Systems and Computer Engineering, Carleton University Fall SYSC2001-Ch7.ppt 1 Chapter 7 Input/Output 7.1 External Devices.
(More) Interfacing concepts. Introduction Overview of I/O operations Programmed I/O – Standard I/O – Memory Mapped I/O Device synchronization Readings:
Design and Performance of a PCI Interface with four 2 Gbit/s Serial Optical Links Stefan Haas, Markus Joos CERN Wieslaw Iwanski Henryk Niewodnicznski Institute.
ECS 152A 4. Communications Techniques. Asynchronous and Synchronous Transmission Timing problems require a mechanism to synchronize the transmitter and.
I/O management is a major component of operating system design and operation Important aspect of computer operation I/O devices vary greatly Various methods.
PROCStar III Performance Charactarization Instructor : Ina Rivkin Performed by: Idan Steinberg Evgeni Riaboy Semestrial Project Winter 2010.
Chapter 2 Introducing the PIC Mid-Range Family and the 16F84A The aims of this chapter are to introduce: The PIC mid-range family, in overview The overall.
Accessing I/O Devices Processor Memory BUS I/O Device 1 I/O Device 2.
Features of the new Alibava firmware: 1. Universal for laboratory use (readout of stand-alone detector via USB interface) and for the telescope readout.
ATtiny23131 A SEMINAR ON AVR MICROCONTROLLER ATtiny2313.
L/O/G/O Input Output Chapter 4 CS.216 Computer Architecture and Organization.
CE Operating Systems Lecture 2 Low level hardware support for operating systems.
LHCb front-end electronics and its interface to the DAQ.
1 Lecture 1: Computer System Structures We go over the aspects of computer architecture relevant to OS design  overview  input and output (I/O) organization.
Input/Output Problems Wide variety of peripherals —Delivering different amounts of data —At different speeds —In different formats All slower than CPU.
IT3002 Computer Architecture
Computer operation is of how the different parts of a computer system work together to perform a task.
بسم الله الرحمن الرحيم MEMORY AND I/O.
Lecture Overview Shift Register Buffering Direct Memory Access.
1 Device Controller I/O units typically consist of A mechanical component: the device itself An electronic component: the device controller or adapter.
Interrupts and Exception Handling. Execution We are quite aware of the Fetch, Execute process of the control unit of the CPU –Fetch and instruction as.
1 Programming of FPGA in LiCAS ADC for Continuous Data Readout Week 4 Report Tuesday 22 nd July 2008 Jack Hickish.
UNIT – Microcontroller.
SLP1 design Christos Gentsos 9/4/2014.
Serial I/O and Data Communication.
An Introduction to Microprocessor Architecture using intel 8085 as a classic processor
Computer Architecture
The Read Out Driver for the ATLAS Muon Precision Chambers
CSCI 315 Operating Systems Design
I/O Systems I/O Hardware Application I/O Interface
Interrupt handling Explain how interrupts are used to obtain processor time and how processing of interrupted jobs may later be resumed, (typical.
Chapter 13: I/O Systems I/O Hardware Application I/O Interface
ADSP 21065L.
Chapter 13: I/O Systems.
Presentation transcript:

MROD IDR 12 Nov / JV 1 1.MROD-1 software 2.Testing 3.TigerSHARC in stead of the 21160?

MROD IDR 12 Nov / JV 2 MRODin event fragment with header sent to MRODout via SHARC link Input Manager1 process() MRODin software structure (data flow) LinkDMA Controller Event Info Buffer1 Event Buffer Manager1 process() inform(…) release()next(…) release() push(…) next(…) release() checkEventArrived() setupEventDMA(…)checkEventSent(…) event length & id 1 event data 1 Event Data Buffer1 128k Input Manager2 Event Info Buffer2 Event Buffer Manager2 process() inform(…) release()next(…) release() push(…) next(…) release() checkEventArrived() event length & id 2 event data 2 Event Data Buffer2 128k process() OutputManager Separate data streams from 2 FPGAs via SHARC external bus Chained DMA Blocks represent C++ objects Endless DMA Endless DMA

MROD IDR 12 Nov / JV 3 while (running) { theInputManager1.process(); theEventBuffermanager1.process(); theInputManager2.process(); theEventBuffermanager2.process(); theOutputManager.process(); theControlManager.process(); } During normal running and for two active inputs this loop is executed: Interrupts signal fault conditions only and do not occur during regular operation. The Control Manager takes care of communication with a controller (via the MRODout) and of controlling state changes NB: "process" methods are "inline" methods

MROD IDR 12 Nov / JV 4 Event data may be overwritten if the buffer is not emptied fast enough. This can happen in case of an XOFF on the ROL or for many large events arriving directly after each other. When the available buffer size in the EventDataBuffer drops below a certain limit an error flag is raised and events need to be skipped (the latter is not yet implemented). Halting the transfer of event data from the FPGA to the SHARC appears at first sight attractive, as there may be more buffer space available in the ZBT RAM. However, recovering from an overflow in the ZBT RAM is not possible by skipping events in a controlled way, as is possible for preventing overflow in the buffer in the SHARC. (NB: in the MROD-1 it is not possible to halt the transfer of event data from the FPGA to the SHARC temporarily). Input Manager1 process() Event Info Buffer1 Event Buffer Manager1 process() inform(…) release()next(…) release() push(…) next(…) release() checkEventArrived() event length & id 1 event data 1 Event Data Buffer1 128k Endless DMA

MROD IDR 12 Nov / JV 5 Length and event id information 1 Event data 1 Input Manager1 Event Data Buffer1 56k Event Data Buffer4 56k Length and event id information 4 Event data 4 Input Manager4 OutputManager Event fragment with header sent to ROL GivePointerToFreeSpace ( spaceRequired) process() EventAvailable(streamId) GetPointerToEvent() ReleaseEvent() EventAvailable(streamId) DMA MRODOut software structure (data flow) Blocks represent C++ objects GivePointerToFreeSpace ( spaceRequired) GetPointerToEvent() Via one SHARC link ROLDMA Controller SetUpDMA()GetEventSent()Chained DMA, at max. 256 words per chain 4 x

MROD IDR 12 Nov / JV 6 while (running) { theInputManager1.process(); theInputManager2.process(); theInputManager3.process(); theOutputManager.process(); theControlManager.process(); } During normal running and for three active inputs a loop similar to the main loop of the MRODin is executed: Interrupts signal fault conditions only and do not occur during regular operation. The Control Manager takes care of communication with the crate controller and of controlling state changes

MROD IDR 12 Nov / JV 7 Length and event id information 1 Event data 1 Input Manager1 Event Data Buffer1 56k GivePointerToFreeSpace ( spaceRequired) EventAvailable(streamId) GetPointerToEvent() ReleaseEvent() DMA Via one SHARClink A DMA for transfer of event data is only started when enough space in the buffer is present (14.4 kByte is needed for at max. 100 words per TDC and 18 TDCs per CSM). Transfers across a SHARC link are subject to handshaking -> event data cannot be overwritten. Optimization is possible here with the help of fixed size DMA transfers or endless DMA and retrieval of the length and event id information from the buffer.

MROD IDR 12 Nov / JV 8 MRODout: at max. 256 words can be transferred as one block to the S-link, as the FIFO in the S-link interface has a size of 256 words. A write attempt to a full FIFO will result in a stall of the external bus until there is place again in the FIFO. There is a flag signalling that the FIFO is empty. -> On the basis of this flag blocks of at max. 256 words can be safely transferred. Under normal conditions the data leave the FIFO with the same speed (160 MByte/s) as they enter the FIFO. However, an XOFF on the ROL will cause the FIFO to fill (this caused problems last summer in the H8 testbeam, as a resulting bus hang prevents communication between the crate controller and the MRODout and may result in a VME-bus error).

MROD IDR 12 Nov / JV 9 Operational aspects The SHARCs are booted via the VME-bus, the MRODout SHARCs directly via their internal memory, the MRODin SHARCs via their links. All can, after booting, communicate with a server program running on the crate controller, which provides support for terminal and file I/O. These facilities cannot be used for high rate data-taking, but are very useful for debugging and for error reporting. Program development The Visual DSP++ environment running under Windows is needed for cross-compiling and linking and producing loadable files. Program testing in an environment simulating SHARC and FPGA specific features has proven to be very useful for initial program development and debugging and for bug hunting. This is done under MACOS X or Windows with the Metrowerks Codewarrior IDE exploiting its debugging and code navigation features.

MROD IDR 12 Nov / JV 10 Status software Working on transition to C++ on the basis of the software used in the H8 testbeam. This results in improved maintainability of the software, while optimization of the performance is no longer hampered by the lack of information hiding in the software written in C. Performance "H8 software", single channel input, output to S-link: about 70 kHz max. event rate C++ software, 6 channels, output to S-link, extrapolated: 72 kHz

MROD IDR 12 Nov / JV 11 One MRODin, one input channel, MRODout acting as data-sink, two words per TDC: 130 kHz One MRODin, two input channels, MRODout acting as data-sink, empty events (only headers and trailers): 75 kHz One MRODin, one input channel, MRODout with S-link output, empty events (only headers and trailers): > 104 kHz (event rate limted by generator) Two MRODins, one input channel, MRODout with S-link output, empty events (only headers and trailers): 87 kHz Three MRODins, one input channel, MRODout with S-link output, empty events (only headers and trailers): 72 kHz (1 output SHARC) MRODout program clearly can be optimized (e.g. chained DMA for output not yet implemented), further optimization of MRODin program also seems to be possible (very recent result indicates 95 kHz to be possible)

MROD IDR 12 Nov / JV Testing 1.With programmable data generators in laboratory, HW/SW functionality, including error handling / speed 2.In simulated environment: for debugging software. 3.With cosmic-ray setup: "real-life" testing with full DAQ chain, setup to be used for MDT testing, planned start: 1 February H8, 2003 and in combined testbeam 2004

MROD IDR 12 Nov / JV 13 Several programmable data generators are available: MRODout with appropriate software, with fixed data pattern an event rate up to 130 kHz is possible with correct parity bits, "CSMUX", FPGA based, fragment size per TDC Poisson distributed, 1 of each 256 fragments artificial large to simulate noise, very high frequency possible, driven by pulse generator, >> 100 kHz event rates possible. Three generators are available, more can be constructed. Errors can be introduced in the data (not in the present version), SHaSlink (PCI card with prevous generation Sharc) based, same software as for MRODout is used, but lower rates (~ 75 kHz). Rate can be increased if parity bits can be ignored. Five SHaSlink cards are available and can be synchronized to each other and to an MRODout functioning as data generator.

MROD IDR 12 Nov / JV 14 Many possibilities for testing under a variety of conditions, which are being exploited. NB: the data generated by the software generators tends to be very bursty, which can be helpful to find problems which may not show up when testing with identical time intervals between successive events. Additionally: the cosmic ray setup (5 chambers) can be used as high-rate data generator by lowering thresholds and triggering at random, for testing the MRODout the MRODins can be used in emulation mode.

MROD IDR 12 Nov / JV 15 Overview of results obtained so far with "CSMUX" for max. rate (in kHz) for Poisson distributed event size (average 4 words per TDC), truncated at the values indicated). For larger truncation values the bandwidth of 40 MByte/s of the link between MRODin and MRODout limits the event rate.

MROD IDR 12 Nov / JV 16 The processing resources of the SHARC DSPs used may be sufficient for sustaining event rates up to 100 kHz, but not much headroom is left (if any) for other tasks. Analog Devices has introduced the TigerSHARC as successor to the current SHARC series. A second generation TigerSHARC, the ADSP-TS20x series, is becoming available. These processors have clock frequencies of 500 / 600 MHz instead of the 80 MHz used for the SHARCs in the MROD-1. In addition a number of improvements have been made in the architecture and functionality of the device compared to the ADSP used in the MROD-1. Also the TigerSHARC has communication links (4 or 2) like the 21160, an MROD-2 design based on the TigerSharc could therefore be similar to the MROD-1 design. 3. TigerSHARCs in stead of the 21160?

MROD IDR 12 Nov / JV 17 TS201, TS202, TS TigerSHARC Internal memory512 kByte3, 1.5 or.5 MByte Links6 half duplex4 or 2 full duplex Max. link speed (1 direction) 100 MByte/s (practice: 40 MByte/s) 500 MByte/s Link technology 4 bits parallel with handshaking, single-ended 4 LVDS pairs per direction, with hand-shaking and optional error detection (8-bit checksum per 16 Bytes) Internal memory blocks 2, dual-ported SRAM, 64-bit wide (1 port connected via crossbar to 2 64-bit data busses, other port to peripheral bus) 6, DRAM, 128-bit wide (connected via crossbar to bit data busses) Link bootingonly via link 4via any link DMA from/to link to/from external bus? NoYes

MROD IDR 12 Nov / JV 18 Possible caveat: internal pipelines in TigerSHARC longer (10 cycle instruction pipeline) than in 21160, could result in smaller gain in performance than suggested by difference in clock frequencies. Timing for and behavior of TigerSHARC external bus differs from that of bus, links are different (signalling at 500 MHz is non-trivial). Getting all details right in an TigerSHARC based MROD-2 design may prove to be time- consuming. Furthermore the experience with the has shown that first versions of processors and development software may be buggy and also that Analog Devices can be slow with delivery of relatively bug-free processors. The TS-20x series is just coming on the market ("sampling"), and NIKHEF is only a small customer for Analog Devices....

MROD IDR 12 Nov / JV 19 We ordered a development kit with two TS-201s for evaluation of the possible gain in performance, delivery is expected in December.