FeeCom software during TPC commissioning (Benchmarks)

Sebastian Bablok, Dag Toppe Larsen, Matthias Richter, Benjamin Schockert

Department of Physics and Technology, University of Bergen, Norway
Center for Telecommunication and Technology Transfer, University of Applied Science Worms, Germany
TOC
- TPC commissioning (DCS-FEE part)
  - Setup overview
  - Observations
  - Conclusion
- Benchmarks during commissioning
  - Results
  - Remarks
- Future plans
Front-End-Electronics in DCS
[Diagram: layered control architecture]
- Supervisory Layer: PVSS II (FED client)
- Front-End Device Interface (FED)
- Control Layer: InterComLayer (FED Server, FEE Client); loads configuration data from file OR database (Config. File, Config. DB)
- Front-End Electronics Interface (FEE)
- Field Layer: FeeServers, each attached to a hardware device via internal bus systems
- Control and monitor channels: Cmd / ACK Channel, Service Channel, Message Channel
Schematic layout for commissioning
[Diagram: network layout around a central switch]
- Nodes: tpcfee01 (ICL), tpcfee02 (Test-FedClient), PVSS (incl. FedClient), 6 DCS boards (FeeServer incl. TPC CE)
- Link speeds shown: 100 MBit/s, 10 MBit/s, 100 MBit/s
- Network is divided into an external and an internal part
DCS network setup
- Based on standard protocols/tools: DHCP, DNS, NFS
- DCS boards on private network 10.x.x.x
- .feenet used as local TLD
- Board number used for MAC and IP addresses (24 LSB) and hostname alias (dcs.feenet)
- Gateway running ICL provides communication with the outside world
- Hostname in format tpc-fee_x_yy_z.feenet, with dcs.feenet as alias
- FeeServer name set from hostname
- FeeServer stored on and run from external NFS share
- Logs written to NFS share
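The address scheme above can be sketched as follows. This is only an illustration, assuming the 24-bit board number fills the last three bytes of both the MAC address and the 10.x.x.x IP address; the MAC prefix and the function name are hypothetical and not taken from the slides.

```python
import ipaddress

# Hypothetical locally administered MAC prefix; the real prefix is not stated.
MAC_PREFIX = (0x02, 0x00, 0x00)


def board_addresses(board_number: int) -> tuple[str, str]:
    """Derive (MAC, IP) from a board number using its 24 least significant bits."""
    if not 0 <= board_number < 2**24:
        raise ValueError("board number must fit into 24 bits")
    low = board_number.to_bytes(3, "big")
    mac = ":".join(f"{byte:02x}" for byte in (*MAC_PREFIX, *low))
    # Private network 10.x.x.x: first octet fixed, remaining three from the board number.
    ip = str(ipaddress.IPv4Address(bytes([10]) + low))
    return mac, ip
```

For board number 258 this sketch yields the MAC 02:00:00:00:01:02 and the IP 10.0.1.2.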
DCS bootup
- MAC address set to board number
- DCS board sends MAC address to DHCP server, requesting IP address and hostname
- DHCP server looks up IP address for MAC address, then queries the Domain Name Server for the hostname matching the IP address
- DHCP server returns IP configuration and hostname to DCS board
- DCS board mounts two NFS shares: one read-only (RO) and one read-write (RW)
- Boot script run from RO shared directory
  - May start update scripts
  - Starts FeeServer with hostname as FeeServer name and logs written to RW share
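The boot sequence can be pictured with a small simulation. It is a sketch under stated assumptions: the two lookup tables stand in for the DHCP and DNS server configuration, and the MAC address, IP address, hostname and function names are hypothetical examples.

```python
# Hypothetical tables standing in for the DHCP and DNS server configuration.
DHCP_LEASES = {"02:00:00:00:01:02": "10.0.1.2"}
DNS_REVERSE = {"10.0.1.2": "tpc-fee_0_13_0.feenet"}


def dhcp_offer(mac: str) -> dict[str, str]:
    """MAC -> IP via the DHCP table, then IP -> hostname via DNS."""
    ip = DHCP_LEASES[mac]
    hostname = DNS_REVERSE[ip]
    return {"ip": ip, "hostname": hostname}


def boot_plan(mac: str) -> list[str]:
    """List the boot steps a DCS board would take after receiving its offer."""
    offer = dhcp_offer(mac)
    return [
        f"configure interface with {offer['ip']}",
        "mount read-only NFS share",
        "mount read-write NFS share",
        "run boot script from read-only share",
        f"start FeeServer named {offer['hostname']}, logging to read-write share",
    ]
```

Calling boot_plan("02:00:00:00:01:02") returns the five steps in order, ending with the FeeServer start under the looked-up hostname.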
Cables
- DCS side:
  - Uses non-standard connector without any locking
  - May easily fall out
  - Connectors are glued together, cable attached to cooling plate using cable ties
- Switch side:
  - Standard Ethernet connector
  - Connectors not well made/attached, bad contact
  - Had to be re-crimped
  - Are still sensitive to twisting when plugged into switch/patch panel
Network problems during commissioning
- Some boards were unreachable via the network: 90% packet drop
- Switch indicated 100 Mb/s, not 10 Mb/s as expected
- Most boards affected, but some always, some rarely
- However: a short power cycle seemed to help?
- Turned out there was a bug in the kernel driver: autonegotiation not always enabled on boot
  - Ethernet interface switched to 100 Mb/s operation
  - The electronics between Ethernet chip and cable on the DCS board do not support this because of modifications due to the strong magnetic field
  - Only a few packets got through
- After kernel update, problems gone
Temperature measurements
- All FECs have temperature sensors
  - If the temperature is too high, electronics may be damaged
  - The FeeServer will export temperatures to higher layers
  - High temperatures will cause electronics to be switched off
- During commissioning, temperature was written continuously to log files
  - A temperature cross section for each partition was plotted every 12 hours
  - No alarming temperatures were seen
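The switch-off rule can be illustrated with a minimal sketch. The limit value, the FEC names and the function name are hypothetical; the slides state neither the actual threshold nor how the FeeServer implements the check.

```python
MAX_TEMP_C = 50.0  # hypothetical limit; the real threshold is not given


def fecs_to_switch_off(temps: dict[str, float], limit: float = MAX_TEMP_C) -> list[str]:
    """Return the names of all FECs whose temperature exceeds the limit."""
    return sorted(name for name, temp in temps.items() if temp > limit)
```

With readings such as {"fec00": 32.5, "fec01": 51.0} and the assumed limit, only fec01 would be selected for switch-off.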
Software
- Mostly OK
- InterComLayer/FeeServer interplay is working
- FeeServers sometimes “disappear” from DID, but not from ICL. They seem to be running, but not in a working state
- FeeServers sometimes do not publish services (registration timeout)
- FeeServers crash (and restart) when FECs are turned on and off via DDL
- The kernel update took care of most other problems (it was “impossible” to get all DCS boards running without “dirty tricks”)
Commissioning conclusion
- Network-based configuration worked as planned
- Some initial network problems, OK after kernel update
- No alarming electronics temperatures seen
- Some minor FeeServer issues
- Ethernet cables must be handled with care
Benchmarks during TPC commissioning
- Benchmark done with one patch and a complete slice of the TPC
- Benchmark test performed on TPC side 0 (a), slice 13 (singlecast on patch 0)
- Setup:
  - 6 FeeServers with TPC ControlEngine (CE)
  - Switch: NETGEAR 7300S Series Layer 3 Managed Switch
  - InterComLayer on P4 (3.4 GHz, dual core, 512 MB RAM, SLC 3)
  - FedClient implementation for testing purposes on a different machine
Setup during commissioning and benchmark tests
[Diagram: network layout around a central switch]
- Nodes: tpcfee01 (ICL), tpcfee02 (Test-FedClient), PVSS (incl. FedClient), 6 DCS boards (FeeServer incl. TPC CE)
- Link speeds shown: 100 MBit/s, 10 MBit/s, 100 MBit/s
Components used during benchmark
[Diagram: subset of the layered control architecture]
- Supervisory Layer: PVSS II (FED client)
- Front-End Device Interface (FED)
- Control Layer: InterComLayer (FED Server, FEE Client); loads configuration data from file (Config. File)
- Front-End Electronics Interface (FEE)
- Field Layer: FeeServer / CE
- Channel used: Cmd / ACK Channel
Benchmarks layout
- Issued command: switching on / off all Front-End-Cards of the patch
- Command size: 12 Byte (+ 12 Byte of FeePacket header = 24 Byte)
- CE was emulating the execution of the “switch on/off FEC” command
- Sent as:
  - Singlecast and
  - Broadcast for a complete slice
  - from Test-FedClient and from PVSS
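The size arithmetic (12 Byte command + 12 Byte header = 24 Byte) can be checked with a short packing sketch. The header field layout (32-bit id, 16-bit error code, 16-bit flags, 32-bit checksum), the byte order, the CRC32 checksum and the payload words are all assumptions made for illustration; the slides only give the sizes.

```python
import struct
import zlib

# Assumed header layout: id (uint32), error code (int16), flags (uint16), checksum (uint32).
HEADER_FORMAT = "<IhHI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes with no padding


def build_packet(packet_id: int, payload: bytes) -> bytes:
    """Prefix the payload with the assumed 12-byte header."""
    checksum = zlib.crc32(payload)  # placeholder checksum, not the real algorithm
    header = struct.pack(HEADER_FORMAT, packet_id, 0, 0, checksum)
    return header + payload


# Hypothetical 12-byte command: three 32-bit words.
command = struct.pack("<3I", 0x1, 0x2, 0x3)
packet = build_packet(1, command)
```

Here len(command) is 12 and len(packet) is 24, matching the sizes quoted above.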
Benchmark results during TPC commissioning
SingleCast ControlFero command
- Reported per time period: average, max, min [sec]
- Time periods measured:
  - Command in FedServer to ACK in FeeClient
  - SEND to ACK in FeeClient
  - Process time in ICL
  - FeeServer computing
- Annotations:
  - command issued 100 times
  - no lost ACKs
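How average, max, min and the number of lost ACKs follow from send and ACK timestamps can be sketched as below. The function name and the input format (dictionaries mapping a command id to a timestamp in seconds) are assumptions; the benchmark's actual logging format is not described.

```python
from statistics import mean


def ack_latency_stats(sent: dict[int, float], acked: dict[int, float]) -> dict[str, float]:
    """Summarise command-to-ACK latencies; commands without an ACK count as lost."""
    latencies = [acked[cmd] - sent[cmd] for cmd in sent if cmd in acked]
    stats: dict[str, float] = {"count": len(latencies), "lost": len(sent) - len(latencies)}
    if latencies:
        stats["average"] = mean(latencies)
        stats["max"] = max(latencies)
        stats["min"] = min(latencies)
    return stats
```

The same summary could be computed once per time period and, for broadcasts, once per patch.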
Benchmark results during TPC commissioning
BroadCast ControlFero command (FedServer to ACK in FeeClient)
- Columns: all, patch0, patch1, patch2, patch3, patch4, patch5
- Rows: average, max, min [sec]; count
- Annotations:
  - command issued 96 times
  - lost ACKs: 21 (no command was issued to FeeServers that were already missing)
Benchmark results during TPC commissioning
FeeServer/CE benchmark (receive command to send ACK)
- Columns: patch0, patch1, patch2, patch3, patch4, patch5
- Rows:
  - average [sec]
  - max [sec]
  - min [sec]: 0.02
  - seg faults
  - duplicated ACKs
  - counts
- Annotations:
  - command issued 100 times
  - duplicated ACKs may indicate temporarily lost links to ICL and/or DIM-DNS
Remarks on benchmark tests
- Very delayed ACKs
  - A very few ACKs reached the FeeClient after the ACK of the following command had already been received
  - One ACK overtaking another is not possible in the FeeServer and DIM framework
  - Most likely the packet was temporarily stuck in the switch
- Duplicated ACKs
  - Most likely due to a lost link to FeeServer or DIM-DNS
  - Should not disturb the system; filtered out by InterComLayer
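Filtering duplicated ACKs and flagging late ones can be sketched as follows. This is not the InterComLayer implementation: the class name and the assumption that every ACK carries a command id that increases with each issued command are hypothetical.

```python
class AckFilter:
    """Classify incoming ACK ids as 'ok', 'duplicate' or 'late'."""

    def __init__(self) -> None:
        self._seen: set[int] = set()
        self._latest: int | None = None

    def accept(self, ack_id: int) -> str:
        if ack_id in self._seen:
            return "duplicate"  # already delivered once, drop it
        self._seen.add(ack_id)
        if self._latest is not None and ack_id < self._latest:
            return "late"  # arrived after the ACK of a later command
        self._latest = ack_id
        return "ok"
```

Feeding the ids 1, 2, 2, 4, 3 into one filter instance gives ok, ok, duplicate, ok, late.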
Future tests
- Extended tests with more slices: 2, 9, 18 (one side), 36 (whole TPC, both sides)
- Preparing a complete set of benchmark tests for when the TPC is available again in May 2007
- Tests with real commands, real configuration data and real execution in CE
- Benchmarks of the Service Channels (fast triggered update of temperature, etc.)
- (Usage of the CommandCoder during tests)
- Further investigation of delayed ACKs
- Verify that duplicated ACKs will not disturb the system