Reflective Memory Recorder Upgrade: an opportunity to benchmark PowerPC and Intel architectures for real time.
Roberto Abuter, Helmut Tischer, Robert Frahm
European Southern Observatory
07/11/2018 PowerPC / Intel Benchmark

History & Status
- mvme167: SBC based on the 68k MC68040, 25 MHz (1979)
- mvme2600: SBC based on the MPC604, 200 MHz (1994)
- mvme2700: SBC based on the MPC750, 366 MHz (1997)
- mvme6100: SBC based on the MPC7457, 1.3 GHz (2001)
- Used on ~300 computers.
- A small fraction have hard real-time requirements; the rest are pseudo-real-time.

Why do we need new, more powerful CPUs?
In fast control loops:
- Reduce latency and pure computational delays => dramatically improve control system bandwidth.
- Simplify the software architecture (less asynchronous, less distributed, fewer communication requirements, etc.).
In general:
- Confront imminent obsolescence.
- Reduce heat dissipation per unit of processing (the 6100, a.k.a. the VME toaster).
- Improve system response time.
- Support newer, more demanding applications (wavefront control, huge detector arrays, interferometry, etc.).

Why do we need new, more powerful CPUs?
Dramatic effect of computational delay in control loops.

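One way to see the effect is that a pure delay of tau seconds adds a phase lag of 360 * f * tau degrees at frequency f, so a fixed phase budget caps the usable loop bandwidth. The sketch below is a minimal illustration in Python; the function names and the example numbers are assumptions for illustration, not values from the benchmark.

```python
def delay_phase_lag_deg(freq_hz: float, delay_s: float) -> float:
    """Phase lag (degrees) that a pure time delay adds at a given frequency."""
    return 360.0 * freq_hz * delay_s


def max_frequency_hz(delay_s: float, phase_budget_deg: float) -> float:
    """Frequency at which the delay alone uses up the given phase budget."""
    return phase_budget_deg / (360.0 * delay_s)


if __name__ == "__main__":
    # Hypothetical numbers: 1 ms of total delay, evaluated at 100 Hz.
    print(delay_phase_lag_deg(100.0, 1e-3))
    # Hypothetical numbers: 100 µs of delay and a 10 degree budget.
    print(max_frequency_hz(100e-6, 10.0))
```

Halving the delay doubles the frequency at which the same phase budget is reached, which is why shaving computational delay translates directly into loop bandwidth.
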
Benchmark Hardware
PowerPC:
- MVME6100, MPC7457 PowerPC, 1.267 GHz, VME, 0.5 GB RAM, Gigabit Ethernet.
- GE Reflective Memory PMC-5565.
- Cost: ~6000 Euros.
Intel MultiCore:
- Intel Core i7 E5-1410 0, 2.8 GHz, 4 cores, PCI Express, 12 GB RAM.
- Cost: ~1400 Euros.

Benchmark Application
VLTI Reflective Memory Network Recorder
- Based on the VxWorks real-time operating system.
- Based on the standard VLT instrument software framework.
- Sequence of DMA read accesses to sparse pieces of reflective memory.
- Streams collected data at up to 0.5 Gigabit/second.
Tasks:
- trfmCtrl: DMA-reads reflective memory into local memory.
- trfmXfer: transfers data from memory to a remote system via TCP/IP.
- tNet0: system network stack task.
- trfmMon: background task running at 1 Hz.

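The split between a read-in task and a transfer task can be sketched as a producer/consumer pair. The Python below is only an illustration of that structure: Python threads stand in for VxWorks tasks, there are no task priorities or DMA, and `run_recorder`, `read_sample` and `send_buffer` are hypothetical names, not part of the recorder software.

```python
import queue
import threading


def run_recorder(n_cycles, samples_per_buffer, read_sample, send_buffer):
    """Read n_cycles samples; hand every full buffer to a transfer thread."""
    full_buffers = queue.Queue()

    def transfer_task():
        # Stand-in for the transfer task: ship buffers as they become full.
        while True:
            buf = full_buffers.get()
            if buf is None:  # sentinel: no more data will arrive
                break
            send_buffer(buf)

    xfer = threading.Thread(target=transfer_task)
    xfer.start()

    # Stand-in for the control task: one sample per cycle into the active buffer.
    active = []
    for cycle in range(n_cycles):
        active.append(read_sample(cycle))
        if len(active) == samples_per_buffer:
            full_buffers.put(active)  # hand over the full buffer
            active = []               # continue with a fresh one
    if active:
        full_buffers.put(active)      # flush the partially filled buffer
    full_buffers.put(None)
    xfer.join()


if __name__ == "__main__":
    sent = []
    run_recorder(10, 4, lambda cycle: cycle, sent.append)
    print(sent)
```

The reader never waits for the network: it only swaps buffers, so the time-critical path stays short while the slower transfer runs in the background.
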
RMN recorder configurations
Fragmented configuration: 10 kHz, leading to a transfer rate of 68.3 Mbit/s.

RMN recorder configurations
Big Data configuration: 10 kHz, leading to a transfer rate of 290 Mbit/s.

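The quoted transfer rates can be turned back into an approximate payload per read-in cycle. This is a small worked calculation in Python, assuming decimal megabits (10^6 bits) and that the quoted rate counts recorded payload only; `bytes_per_cycle` is an illustrative name.

```python
def bytes_per_cycle(rate_mbit_s: float, cycle_hz: float) -> float:
    """Payload bytes read per cycle for a given stream rate and cycle frequency."""
    bits_per_second = rate_mbit_s * 1e6  # assumes 1 Mbit = 10**6 bits
    return bits_per_second / 8 / cycle_hz


if __name__ == "__main__":
    print(bytes_per_cycle(68.3, 10_000))   # fragmented configuration
    print(bytes_per_cycle(290.0, 10_000))  # Big Data configuration
```

Under these assumptions the fragmented configuration moves roughly 850 bytes per cycle and the Big Data configuration roughly 3.6 kB per cycle.
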
Cycles to benchmark
Read-in cycle (control task):
- Woken by the system clock interrupt.
- Sequential DMA accesses.
- Only a few µs available (set by the cycle frequency).
- Cannot be delayed beyond the µs budget without losing samples.
- Higher priority.
Transfer cycle (transfer + network tasks):
- Woken when the read-in cycle buffers are filled (i.e. every 3 seconds).
- TCP/IP socket data transfer.
- Can be delayed, as long as the full buffer is transmitted within the available 3 seconds.
- Lower priority.

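The two time budgets can be made concrete with a short calculation. The sketch below assumes that a buffer holds exactly 3 seconds of samples and that the buffer fills at the quoted stream rate (in decimal megabits); the function names are illustrative.

```python
def cycle_period_us(cycle_hz: float) -> float:
    """Time available to one read-in cycle, in microseconds."""
    return 1e6 / cycle_hz


def samples_per_buffer(cycle_hz: float, fill_s: float = 3.0) -> int:
    """Number of read-in cycles that fit in one buffer fill period."""
    return round(cycle_hz * fill_s)


def buffer_megabytes(rate_mbit_s: float, fill_s: float = 3.0) -> float:
    """Data accumulated during one fill period, in megabytes (10**6 bytes)."""
    return rate_mbit_s * fill_s / 8


if __name__ == "__main__":
    for hz in (10_000, 16_000):
        print(hz, cycle_period_us(hz), samples_per_buffer(hz))
    print(buffer_megabytes(290.0))  # data per 3 s buffer at 290 Mbit/s
```

At 10 kHz the read-in cycle has 100 µs and a buffer spans 30000 cycles; at 16 kHz the budget shrinks to 62.5 µs, while the transfer cycle still has the full 3 seconds to ship each buffer.
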
Read-in cycle - PowerPC - 10 kHz - Fragmented
Read-in cycle - Intel (1 core) - 10 kHz - Fragmented
Transfer cycle - PowerPC - 10 kHz - Fragmented
Transfer cycle - Intel (1 core) - 10 kHz - Fragmented
Read-in cycle - PowerPC - 16 kHz - Fragmented
The read-in cycle (control task) cannot execute within about 62 µs, which leads to:
- Loss of data, or
- Crash of the computer.
Possibilities to overcome this:
- Redesign the reflective memory map to reduce fragmentation => high development and integration costs, i.e. it affects many systems.
- Reprogram the reflective memory driver to use chained DMA access => high development cost and complex maintenance.

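Why fragmentation hurts can be illustrated with a simple cost model in which every sequential DMA access pays a fixed setup cost plus a size-dependent transfer time. The Python below is a sketch under that assumption; the setup time, bandwidth and fragment sizes are made-up numbers chosen for illustration, not measurements from the benchmark.

```python
def read_in_time_us(fragment_sizes_bytes, setup_us, bandwidth_mb_s):
    """Total time of sequential DMA reads under a fixed-setup-cost model.

    A bandwidth of 1 MB/s (10**6 bytes/s) equals 1 byte per microsecond.
    """
    return sum(setup_us + size / bandwidth_mb_s for size in fragment_sizes_bytes)


def fits_budget(fragment_sizes_bytes, setup_us, bandwidth_mb_s, cycle_hz):
    """True if the whole read-in completes within one cycle period."""
    period_us = 1e6 / cycle_hz
    return read_in_time_us(fragment_sizes_bytes, setup_us, bandwidth_mb_s) <= period_us


if __name__ == "__main__":
    fragmented = [64] * 16   # hypothetical: 16 small pieces
    contiguous = [1024]      # hypothetical: same data in one piece
    for layout in (fragmented, contiguous):
        print(read_in_time_us(layout, setup_us=5.0, bandwidth_mb_s=100.0),
              fits_budget(layout, 5.0, 100.0, 10_000),
              fits_budget(layout, 5.0, 100.0, 16_000))
```

With these hypothetical numbers the fragmented layout fits a 10 kHz cycle but not a 16 kHz one, while the same data in a single piece fits both. Both options listed above aim to cut the per-access overhead, whereas a faster CPU and bus raise the budget without touching the memory map or the driver.
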
Read-in cycle - Intel (1 core) - 16 kHz - Big Data
Read-in + Transfer cycle - Intel MultiCore - 16 kHz - Big Data
% Idle Time
- 16 kHz is not possible on PowerPC.
- The MultiCore system remains mainly idle in both configurations and at both frequencies.

Conclusions
- The RMN recorder range of operation can be extended to at least 16 kHz by moving to Intel MultiCore.
- As expected, VxWorks isolates application software from the hardware architecture, so the migration cost is low.
- Performance scales with available CPU power and I/O bandwidth.
- Application access to the VME bus is very slow.
- Cost per unit of CPU power is dramatically reduced by moving from PowerPC to Intel MultiCore.