Simple ideas on how to integrate L2CAL and L2XFT ---> food for thought. Ted, May 25th, 2007.

Outline
- Comment on the final L2 upgrade commissioning efforts
- Some basics on how FILAR/S32PCI64 works
- The problem with L2CAL and L2XFT integration due to large data package size
- Ideas on possible quick solutions

Comment on the final commissioning efforts
- We have a wonderful team: the L2 United team
- A strong/powerful team in the past two weeks, even with two key people (the SPLs, Vadim & Gene) on vacation
- It has been a real pleasure working with them in the past two weeks: people who spent most of their time on final commissioning in the past ~two weeks

The L2 tough ladies

S32PCI64: single channel FILAR

The four-channel FILAR

S32PCI64:
- With transmitter firmware --> Solar
- With receiver firmware --> single-channel FILAR
Four-channel FILAR (Rx)

L2 Decision Block: Conceptual view
[Diagram labels. Inputs: CAL1, CAL2, CAL3, CAL4, Reces, SVT, Muon/XTRP/L1, Old cluster, XFT1, XFT2, XFT3. Nodes: new L2 decision node (Solar, Filar), spare node, control node process. Output to L2TS Pulsar: L2D + (TL2D if L2A).]

Inside the Box
[Figure labels: decision node, spare node]

So what is the problem then?
Data words per input: CAL1 == 104, CAL2 == 104, CAL3 == 156, CAL4 == 104, XFT1 > 120, XFT2 > 120, XFT3 > 120 (other inputs: Reces, SVT, Muon/XTRP/L1, Old cluster)
- Both new data paths have large data volume: L2XFT (grows with luminosity, could be very large) + L2CAL (fixed length)
- Data packages fight over 2 PCI buses; when a traffic jam occurs, data wait in the input FIFO
- The input FIFO could overflow IF data packages are too large ---> missing data words --> CPU detects this --> L2DTO
[Diagram labels: To L2TS Pulsar: L2D + (TL2D if L2A); Control Node process; Spare node]

The actual setup with beam
- had a problem when both L2CAL + L2XFT are included
- shows up as XFT data package corruption --> L2 DTO
- no problem when only L2CAL is included
[Diagram labels: PCI bus 1, PCI bus 2; inputs CAL1, CAL2, RECES, CAL3, SVT, Muon/XTRP/L1, XFT3, XFT1, XFT2, Old Cluster, CAL4]

With L2CAL only, i.e. without XFT: --> do not expect a FIFO overflow problem
Data words per input: CAL1 == 104, CAL2 == 104, CAL3 == 156, CAL4 == 104 (other inputs: RECES, SVT, Muon/XTRP/L1, XFT1, XFT2, XFT3, Old Cluster)
PCI bus 1: on this Solar bus, the worst case is likely an L2A with the TL2D bank and scaler info being sent out from the Solar… up to a ~1K-word SLINK package… While this package is transferring over PCI bus 1, the largest CAL3 data has to wait in the input FIFO. But the maximal number of input events is 3 (since 1 is already on its way, L2Aed). 3 x 156 = 468 < 512 FIFO depth… impossible to fill up the FIFO.
PCI bus 2: worst case is 4 L1A in a row… The largest data package is CAL4 == 104. CAL4: 104 x 4 = 416 < 512. No problem.
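The worst-case arithmetic on this slide can be written out as a small check. This is only a sketch: the function names are made up for illustration, and it assumes, as the slide does, that every queued event holds a full-size package and that the FIFO depth is 512 words.

```python
FIFO_DEPTH_WORDS = 512  # input FIFO depth quoted on the slide


def queued_words(package_words, queued_events):
    """Words waiting in one input FIFO if every queued event holds a full package."""
    return package_words * queued_events


def fifo_safe(package_words, queued_events, depth=FIFO_DEPTH_WORDS):
    """True if the worst-case backlog stays strictly below the FIFO depth."""
    return queued_words(package_words, queued_events) < depth


# PCI bus 1: CAL3 (156 words), at most 3 events waiting behind an L2A readout.
bus1_backlog = queued_words(156, 3)
# PCI bus 2: CAL4 (104 words), 4 L1A in a row.
bus2_backlog = queued_words(104, 4)

print(bus1_backlog, fifo_safe(156, 3))
print(bus2_backlog, fifo_safe(104, 4))
```

The same check with a hypothetical 130-word package queued four deep gives 520 words, which no longer fits. The slides only state that XFT packages are larger than 120 words, so that number is purely illustrative.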

Can we solve this problem?
The Pulsar-based system is very flexible
- zero-suppression, fast abort, enable back-pressure, use dedicated PC for dedicated buffer event … etc
Many ways to deal with this, but I will ONLY talk about the simplest possible solutions… (with the potential to be done within weeks, not months).
Will only show two simple ideas today
- (1) let large XFT data have higher priority over the PCI bus
- (2) use another PC for the XFT data, to do fast-abort or data suppression, ROI (Region-Of-Interest)…

How to let XFT data packages have higher priority over the PCI bus?
Current way of PCI handling for each data channel:
- Send requests for up to 4 (effectively) buffer events at the beginning of the run
- This means that not only the current event package, but also packages from the next event(s) could compete over the PCI bus if they arrive early: in other words, whoever arrives early gets the bus…
- This made sense in the past because algorithm time was almost zero, and all data paths had small size
How to let the largest data path have the highest priority? This can be done by simply playing tricks with the request FIFO:
- Only send requests for up to 4 buffer events for the largest data paths at the beginning of the run. Do not send requests for all other (smaller) data paths
- For each event, only after the largest packages have arrived in PC memory, send requests for the rest of the data paths, but only for the current event. Before that, all other smaller data packages will have to wait in their input FIFO: not allowed to fight over the PCI bus.
This should reduce the possibility of FIFO overflow, but may not fix the problem in a fundamental way (as luminosity increases), and will likely delay the L2CAL path. But it is the simplest thing to do… this will be done soon…
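The request-FIFO trick described above can be sketched as a toy model. This is not the actual FILAR driver interface: the function names, the (path, event) tuples and the choice of which paths count as "largest" are all assumptions made for illustration.

```python
MAX_BUFFERED_EVENTS = 4  # "up to 4 (effectively) buffer events" per the slide


def initial_requests(paths, priority_paths):
    """Requests posted at the beginning of the run.

    Only the largest (priority) data paths get requests, up to 4 events deep.
    Smaller paths get none, so their data waits in the input FIFO.
    """
    return [(path, event)
            for path in paths if path in priority_paths
            for event in range(MAX_BUFFERED_EVENTS)]


def requests_after_arrival(paths, priority_paths, arrived, event):
    """Requests posted once the priority packages for `event` are in PC memory.

    Returns requests for the remaining paths, for the current event only,
    or an empty list while any priority package is still missing.
    """
    if not set(priority_paths).issubset(arrived):
        return []
    return [(path, event) for path in paths if path not in priority_paths]


paths = ["CAL1", "CAL3", "XFT1", "XFT2"]   # hypothetical subset of the inputs
priority = {"XFT1", "XFT2"}                # assumed to be the largest paths

print(initial_requests(paths, priority))
print(requests_after_arrival(paths, priority, {"XFT1"}, 5))
print(requests_after_arrival(paths, priority, {"XFT1", "XFT2"}, 5))
```

In this model the CAL paths never hold a pending request while an XFT package is still in flight, which is the property the slide relies on to keep the smaller packages from competing for the PCI bus.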

There is a simple way to possibly solve the problem: use the spare PC to receive XFT data and do “zero-suppression” in software … then send to the main PC … This sounds more complicated than it actually is …
Data words per input: CAL1 == 104, CAL2 == 104, CAL3 == 156, CAL4 == 104, XFT1 > 120, XFT2 > 120, XFT3 > 120 (other inputs: Reces, SVT, Muon/XTRP/L1, Old cluster)
Simple implementation:
- Send an empty package if there is no need for stereo tracking
- Send raw XFT data when needed (based on L1 bits)
- Could also do ROI
More involved implementation:
- Do 3D tracking right here for tracks above a certain Pt
- …. your wild ideas here…
[Diagram labels: To L2TS Pulsar: L2D + (TL2D if L2A); Control Node Process controls both PCs; L2 decision node; Solar; Filar (single channel); Filar]

Load balancing
Data words per input: CAL1 == 104, CAL2 == 104, CAL3 == 156, CAL4 == 104, XFT1 > 40, XFT2 > 160, XFT3 > 160 (other inputs: Reces, SVT, Muon/XTRP/L1, Old cluster)
- The spare PC could use 2 single-channel Filars (2048 input FIFO depth) to receive the largest XFT packages…
- The decision node could also use a single-channel Filar to receive the XFT package.
[Diagram labels: To L2TS Pulsar: L2D + (TL2D if L2A); Control Node Process controls both PCs; L2 decision node: Solar, Filar (single channel), Filar; the spare PC: Solar, Filar (single channel), Filar]
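To see what the deeper FIFO buys, one can count how many full packages fit before overflow. A rough sketch only: it uses 160 words as an illustrative package size (the slide gives this only as a lower bound) and compares the 512-word depth used in the earlier worst-case estimate with the 2048-word depth quoted here for the single-channel Filar.

```python
def packages_that_fit(fifo_depth_words, package_words):
    """Number of whole packages an input FIFO can hold before it overflows."""
    return fifo_depth_words // package_words


XFT_WORDS = 160  # illustrative size; the slide only says "> 160"

print(packages_that_fit(512, XFT_WORDS))
print(packages_that_fit(2048, XFT_WORDS))
```

Since XFT packages grow with luminosity, the real headroom would shrink accordingly; the ratio between the two FIFO depths stays the same.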

This all looks more complicated than it is --- so what is really involved then?
Step 1: how to ask the existing control node process to also control the spare PC?
- Daniel’s design of the control node can naturally handle multiple PCs (an L3-based design)
- In Daniel’s words: “just use the CardEditor to change number of PCs from 1 to 2, then input the spare PC IP address”.
Step 2: what do we have to change in the existing code for the L2 decision node?
- Nothing, if one simply uses the spare PC to do fast abort based on L1 bits
- It doesn’t know whether the XFT package is from a PC or a Pulsar merger
Step 3: move the XFT fibers from the decision node to the spare node, as well as the copy of the muon/xtrp/L1 fiber.
Step 4: could use single-channel Filars if so desired (software is the same)
Step 5: copy over the decision node software to the spare PC, remove most of the algorithm stuff, do fast-abort here. If the event doesn’t need stereo tracking, send an empty package. Otherwise, send the combined raw data. Later on, could add more functions for zero-suppression, such as ROI. Or, if you really want, simply do stereo tracking right here in the spare PC… e.g. for muons … Your wild idea here…
We may want to order 2 more PCs soon, as spares…
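The fast-abort logic of Step 5 could look roughly like the sketch below. Everything specific here is an assumption for illustration: the L1 bits are modelled as an integer bit field, `stereo_trigger_mask` is a hypothetical mask of the L1 bits that need stereo tracking, and a package is modelled as a plain list of data words.

```python
def needs_stereo_tracking(l1_bits, stereo_trigger_mask):
    """True if any L1 bit that requires stereo tracking fired for this event."""
    return (l1_bits & stereo_trigger_mask) != 0


def build_xft_package(l1_bits, stereo_trigger_mask, xft_inputs):
    """Package the spare PC forwards to the decision node for one event.

    Fast abort: an empty package when stereo tracking is not needed,
    otherwise the raw XFT inputs concatenated in order.
    """
    if not needs_stereo_tracking(l1_bits, stereo_trigger_mask):
        return []
    combined = []
    for words in xft_inputs:
        combined.extend(words)
    return combined


MASK = 0b0011                      # hypothetical: L1 bits 0 and 1 need stereo
inputs = [[1, 2], [3], [4, 5]]     # stand-ins for XFT1, XFT2, XFT3 data words

print(build_xft_package(0b0100, MASK, inputs))
print(build_xft_package(0b0001, MASK, inputs))
```

Because the decision node only sees a package arriving on its XFT input, an empty list here stands for the empty package of Step 5. Zero-suppression or ROI selection would replace the plain concatenation later on.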

System commissioning is fun …
- I am leaving tomorrow, for two weeks
- This is just one of the ideas I have had
- Didn’t want to talk about it, but young people kept asking
- Meant as food for thought, to encourage young people to brainstorm
- Our system is more flexible than we think
About commissioning:
- To have a system more or less working is not hard,
- To have zero error is hard,
- To keep it that way is even harder…
Whatever you do, think about how to win the peace afterwards…