Presentation transcript:

1  Title slide
Andrea Negri, INFN Pavia, on behalf of the ATLAS HLT Group
Nuclear Science Symposium, Rome, 20 October 2004
(Title-slide artwork: streams of binary digits and particle symbols H, t, W, Z, μ, π0.)

2  ATLAS T/DAQ system

LHC / ATLAS parameters:
  CM energy          14 TeV
  Luminosity         10^34 cm^-2 s^-1
  Collision rate     40 MHz
  Event rate         ~1 GHz
  Detector channels  ~10^8

Three-level trigger (rates and latencies):
- Level 1 trigger: hardware based, coarse-granularity calo/muon data; 40 MHz in, ~75 kHz out, latency ~2 μs (pipeline memories).
- Level 2 trigger: a detector sub-region (RoI) is processed at full granularity for all subdetectors, with fast-rejection steering; ~75 kHz in, ~2 kHz out, latency ~10 ms.
- Event Filter (EF): full event access, "seeded" by the LVL2 result, algorithms inherited from offline; ~2 kHz in, ~200 Hz out, latency ~1 s; EF farm of ~1000 CPUs.

Data path: readout drivers (RODs) feed the readout buffers (~1600 ROBs), then the event-builder network and the EF; storage runs at ~300 MB/s. Overall, roughly one selected event per million.
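The rates quoted above pin down the rejection power of each level and the event size implied by the storage bandwidth. A minimal sketch in plain Python, using only the numbers from this slide:

```python
# Rejection factors and implied event size from the quoted T/DAQ rates.
input_rate = 40e6  # 40 MHz bunch-crossing rate

levels = [
    # (name, output rate in Hz, nominal latency in s)
    ("LVL1", 75e3, 2e-6),   # hardware trigger
    ("LVL2", 2e3, 10e-3),   # RoI-based software trigger
    ("EF", 200.0, 1.0),     # Event Filter
]

rate = input_rate
for name, out, latency in levels:
    print(f"{name}: {rate:.3g} Hz -> {out:.3g} Hz, "
          f"rejection x{rate / out:,.0f}, latency ~{latency:g} s")
    rate = out

storage = 300e6  # ~300 MB/s to mass storage
print(f"implied event size: ~{storage / rate / 1e6:.2g} MB")  # ~1.5 MB
```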

3  Event Filter system: constraints and requirements

A common framework for offline and online, with similar reconstruction algorithms:
- avoids duplication of work;
- simplifies performance/validation studies;
- avoids selection biases;
- provides common database access tools.

General requirements:
- scalability, flexibility and modularity;
- hardware independence, in order to follow technology trends;
- reliability and fault tolerance: data losses must be avoided, which could be critical since the EF algorithms are inherited from the offline ones.

The computing instrument of the EF is organized as a set of independent subfarms, each connected to a different output port of the event-builder (EB) switch through a SubFarm Input (SFI); accepted events leave through a SubFarm Output (SFO) towards storage. This makes it possible to partition the EF resources and run multiple concurrent DAQ instances, e.g. for calibration and commissioning purposes (see the sketch below).
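The subfarm-to-EB-port organization is what makes partitioning possible: disjoint subsets of subfarms can be handed to concurrent DAQ instances. A minimal sketch of the idea; the names and layout below are illustrative, not the TDAQ configuration schema:

```python
# Hypothetical mapping of EF subfarms to EB switch ports and partitions.
subfarms = {  # subfarm id -> EB switch output port
    "sf01": 1, "sf02": 2, "sf03": 3, "sf04": 4,
}

partitions = {
    "physics":     ["sf01", "sf02", "sf03"],
    "calibration": ["sf04"],   # runs concurrently with physics
}

# Sanity check: a subfarm may belong to at most one running partition.
used = [sf for sfs in partitions.values() for sf in sfs]
assert len(used) == len(set(used)), "subfarm assigned to two partitions"

for name, sfs in partitions.items():
    ports = [subfarms[sf] for sf in sfs]
    print(f"partition {name}: subfarms {sfs} on EB ports {ports}")
```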

4  Design features

In order to assure data security in case of event-processing problems, the design is based on the decoupling between the data-flow and the data-processing functionalities.

Each processing node manages its own connection with the SFI and SFO elements, which implement the server side of the communication protocol. This:
- allows dynamic insertion/removal of subfarms in the EF, or of processing hosts in a subfarm;
- allows geographically distributed implementations (remote farms);
- supports multiple SFI connections, i.e. dynamic re-routing in case of SFI malfunction, depending on the network topology (see the sketch below);
- avoids single points of failure: a faulty processing host does not interfere with the operation of the other subfarm elements.
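Because the processing node owns the client side of the connection, re-routing can be as simple as walking a list of SFI endpoints. A minimal sketch, assuming hypothetical SFI addresses and a plain TCP transport:

```python
import socket

# Assumed SFI endpoints; in reality these would come from the configuration.
SFI_ENDPOINTS = [("sfi-01", 9000), ("sfi-02", 9000)]

def connect_to_any_sfi(endpoints, timeout=5.0):
    """Return a socket connected to the first reachable SFI.

    If one SFI is down, the client silently re-routes to the next,
    so a single SFI malfunction does not stall the processing node.
    """
    for host, port in endpoints:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError:
            continue  # SFI down or unreachable: try the next one
    raise RuntimeError("no SFI reachable")
```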

5  DataFlow ↔ DataProcessing decoupling

In each EF processing host, the data-flow functionality is provided by the Event Filter Dataflow (EFD) process, which:
- manages the communication with the SFI and SFO;
- stores the events during their transit through the Event Filter;
- makes the events available to the Processing Tasks (PTs), which perform the data-processing and event-selection operations by running the EF algorithms in the standard ATLAS offline framework.

A pluggable interface (PTIO) lets the PTs access the data-flow part via a unix domain socket.
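A minimal sketch of what a PTIO-like exchange over a unix domain socket could look like; the socket path, request codes and wire format below are assumptions for illustration, not the actual PTIO protocol:

```python
import socket
import struct

PTIO_SOCKET = "/tmp/efd-ptio.sock"   # assumed socket path
REQ_GET_EVENT = b"G"                 # assumed one-byte request codes
REQ_DECISION = b"D"

def request_event(sock):
    """Ask the EFD for work; it replies with (offset, size) into the
    sharedHeap rather than with the event payload itself."""
    sock.sendall(REQ_GET_EVENT)
    offset, size = struct.unpack("!QQ", sock.recv(16))
    return offset, size

def send_decision(sock, event_id, accepted):
    """Report the filtering decision; the EFD keeps event ownership."""
    sock.sendall(REQ_DECISION + struct.pack("!Q?", event_id, accepted))

# Usage (requires a running EFD endpoint):
#   with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
#       s.connect(PTIO_SOCKET)
#       offset, size = request_event(s)   # then mmap that heap region
```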

6  Fault tolerance: the sharedHeap (1)

When an event enters the processing node it is stored in a shared memory region (the sharedHeap) used to provide events to the PTs. A PT, using the PTIO interface (socket):
- requests an event;
- obtains a pointer to the sharedHeap portion that contains the event to be processed (the PTIO maps this portion into memory);
- processes the event;
- communicates the filtering decision back to the EFD.

A PT cannot corrupt the events, because its map is read-only; only the EFD manages the sharedHeap. If a PT crashes, the event is still owned by the EFD, which may assign it to another PT or force-accept it.
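The key protection is that the PT's view of the heap is a read-only mapping. A minimal POSIX sketch; the backing object name and the offsets are illustrative:

```python
import mmap
import os

# Map the (assumed) heap backing object read-only, as a PT would.
fd = os.open("/dev/shm/sharedheap", os.O_RDONLY)
heap = mmap.mmap(fd, 0, prot=mmap.PROT_READ)  # POSIX-only read-only view

offset, size = 0, 64                 # would come from the PTIO reply
event = heap[offset:offset + size]   # reading the event is allowed

try:
    heap[0] = 0                      # any write attempt fails:
except TypeError:                    # "mmap can't modify a readonly memory"
    print("write rejected: the PT cannot corrupt the event")
```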

7  Fault tolerance: the sharedHeap (2)

To provide fault tolerance also in case of an EFD crash, the sharedHeap is implemented as a memory-mapped file:
- the OS itself manages the actual write operations, avoiding useless disk I/O overhead;
- the raw events can be recovered by reloading the sharedHeap file at EFD restart.

The system can lose synchronization only in case of a power cut, an OS crash or a disk failure. These occurrences are completely decoupled from the event type and topology, and therefore do not introduce physics biases in the recorded data.
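A minimal sketch of the recovery idea: with the heap backed by a file, a restarted EFD can rescan it for events that were still in transit. The record layout and state flags below are assumptions, not the actual sharedHeap format:

```python
import mmap
import struct

HDR = struct.Struct("!QQB")   # assumed record header: id, size, state flag
IN_TRANSIT, DONE = 1, 2       # assumed state values

def recover_events(path):
    """Yield (event_id, payload) for every event not yet completed,
    by rescanning the memory-mapped heap file after an EFD restart."""
    with open(path, "rb") as f:
        heap = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        size, pos = heap.size(), 0
        while pos + HDR.size <= size:
            ev_id, ev_size, state = HDR.unpack_from(heap, pos)
            if ev_size == 0:
                break                      # end of the used region
            if state == IN_TRANSIT:        # crash left this event pending
                start = pos + HDR.size
                yield ev_id, heap[start:start + ev_size]
            pos += HDR.size + ev_size
```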

8  Flexibility and modularity

The EFD function is divided into specific tasks that can be dynamically interconnected to form a configurable EF dataflow network: for example, SFI input tasks, a sorting task, external-PT (ExtPTs) tasks, calibration, monitoring and trash (debugging channel) tasks, and SFO output tasks feeding the main output stream and the calibration-data stream.

- The internal dataflow is based on reference passing: only the pointer to the event (stored in the sharedHeap) flows among the different tasks.
- Tasks that implement interfaces to external components are executed by independent threads (multi-threaded design), in order to absorb communication latencies and enhance performance.
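A minimal sketch of the two design points above: tasks connected by queues that carry only event references, with the blocking I/O task on its own thread. The class and task names are illustrative:

```python
import queue
import threading

class EventRef:
    """A small reference to an event living in the sharedHeap."""
    def __init__(self, offset, size):
        self.offset, self.size = offset, size

def output_task(inbox, heap=b""):
    """I/O task on its own thread, so SFO latency never blocks routing."""
    while (ref := inbox.get()) is not None:      # None is the shutdown sentinel
        payload = heap[ref.offset:ref.offset + ref.size]
        # ... here the payload would be sent to the SFO ...

to_output = queue.Queue()
t = threading.Thread(target=output_task, args=(to_output,))
t.start()

# A sorting task only routes references; the payload never moves:
for ref in [EventRef(0, 1024), EventRef(1024, 2048)]:
    to_output.put(ref)
to_output.put(None)   # shut the pipeline down
t.join()
```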

9  Functional tests

Robustness of the architecture:
- week-long runs (>10^9 events) without crashes or event losses, even when randomly killing PTs;
- the EFD ↔ PT communication mechanism scales with the number of running PTs.

SFI → EFD → SFO communication protocol:
- exploits gigabit links for realistic event sizes;
- rate limitations for small event sizes (or remote-farm implementations): the EFD asks for a new event only after the previous one has been received, so the rate is limited by the round-trip time; improvements are under evaluation (see the sketch below).

Scalability tests were carried out on 230 nodes: up to 21 subfarms, 230 EFDs and 16000 PTs.
(Plot: throughput for real and dummy PTs on quad-Xeon 2.5 GHz, 4 GB nodes, up to the memory limit.)
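The round-trip limitation is easy to quantify: with a stop-and-wait protocol, each event costs one round trip plus its transfer time on the link, so small events are RTT-bound. A short sketch with assumed link and RTT numbers:

```python
# Stop-and-wait event fetching: rate = 1 / (RTT + size/bandwidth).
rtt = 200e-6     # assumed LAN round-trip time, 200 us
link = 1e9 / 8   # gigabit link, in bytes per second

for ev_size in (1e3, 1e5, 1e6):  # 1 kB, 100 kB, 1 MB
    t = rtt + ev_size / link
    bound = "RTT-bound" if rtt > ev_size / link else "link-bound"
    print(f"{ev_size / 1e3:>6.0f} kB: {1 / t:7.0f} events/s ({bound})")
```

For realistic (MB-scale) events the link dominates, matching the observation above; pipelining several outstanding requests would remove the RTT penalty for small events.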

10  ATLAS Combined Test Beam

11  Test beam layout

One ROS per subdetector (Pixel, SCT, TRT, LAr, Tile, MDT, CSC, TGC, RPC, LVL1calo, LVL1mu) feeds the Event Builder (DFM and SFI) over a GbE data network. A local LVL2 farm writes its result into the pROS, which contains the LVL2 result that steers/seeds the EF processing. The SFI serves a local EF farm and, through a gateway, an EF farm at Meyrin (a few km away); remote farms in Poland, Canada and Denmark took part in infrastructure tests only. Monitoring and run control complete the setup, and accepted events are written out through the SFO.

12  Test beam online event processing

- Online event monitoring: online histograms are obtained by merging the data published by the different PTs, gathered by a TDAQ monitoring process (the Gatherer).
- Online event reconstruction: e.g. track fitting.
- Online event selection: the beam is composed of pions, muons and electrons; track reconstruction in the muon chambers allowed the selection of muon events, which were labelled according to the selection and/or sent to different output streams.
- Validation of the HLT muon slice (work in progress): transfer of the LVL2 result to the EF (via the pROS) and its decoding; steering and seeding of the EF algorithm.
(Screenshot: the Presenter main window.)
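A minimal sketch of the Gatherer idea, merging per-PT partial histograms bin by bin; this is illustrative only, not the TDAQ monitoring API:

```python
from collections import Counter

def merge_histograms(published):
    """Sum per-PT histograms {bin: count} into one online histogram,
    as a Gatherer-like monitoring process would."""
    total = Counter()
    for pt_name, hist in published.items():
        total.update(hist)   # bin-by-bin addition
    return dict(total)

# Hypothetical partial histograms published by two PTs:
online = merge_histograms({
    "PT-1": {0: 12, 1: 30, 2: 7},
    "PT-2": {1: 25, 2: 9, 3: 4},
})
print(online)   # {0: 12, 1: 55, 2: 16, 3: 4}
```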

13  Online event processing: reconstruction
(Slide text as on slide 12. Figure: residuals of segment fits in the muon chambers, σ = 61 μm.)

14  Online event processing: monitoring
(Slide text as on slide 12. Figures: energy deposition in calorimeter cells; hits in the muon chambers.)

15  Online event processing: HLT muon slice
(Slide text as on slide 12. Diagram: test-beam layout highlighting the pROS, local LVL2 farm, ROS, DFM, SFI, data network and local EF farm.)

16  Conclusions

Design: the EF is designed to cope with the challenging online requirements:
- a scalable design, allowing dynamic hot-plugging of processing resources, following technology trends, and supporting geographically distributed implementations;
- a high level of data security and fault tolerance, via the decoupling between data-processing and data-flow functionalities and the use of a memory-mapped file;
- modularity and flexibility, allowing different EF dataflows.

Functional tests: the design was validated on different test beds; its robustness, scalability and data-security mechanisms were proven, and no design limitations were observed.

Deployment on the test-beam setup: online event processing, reconstruction and selection; online validation of the full HLT muon slice.

