Wojciech Sliwinski, BE-CO-IN
for the BE-CO Middleware team: Felix Ehm, Kris Kostro, Joel Lauener, Radoslaw Orecki, Ilia Yastrebov, [Andrzej Dworak]
Special thanks to: Vito Baggiolini and Pierre Charrue
Agenda
• Context & Motivation for Renovation
• Middleware Review process
• Technical evaluation of the transport layer
• Changes in the MW Architecture in LS1
• MW Upgrade milestones in 2013
• Conclusions
Agenda: Context & Motivation for Renovation
Motivations for MW Renovation
• Current CORBA-based CMW-RDA
  ○ Integrated in the Control system
  ○ Used to operate all CERN accelerators
  ○ Provides the widely accepted Device/Property model
  ○ > 10 years old
• Why review & upgrade the MW?
  ○ CORBA was chosen 15 years ago
  ○ Technical limitations of the CORBA-based transport
  ○ Functional limitations of the current CMW-RDA
  ○ Codebase with a long history: difficult to maintain, needs architecture review
• Major issue of long-term support & future evolution
  ○ Evolution of technology over the last 10 years: HW, OS, middleware, 3rd party libraries
  ○ Human factor: less & less CORBA expertise on the market
Technical limitations of CORBA transport
• Became legacy, not actively supported: maintenance issue
  ○ Shrinking community, slow response time
  ○ omniORB (C++): 1 developer/maintainer, last release mid-2011
  ○ JacORB (Java): few developers, small community
• Major technical limitations
  ○ Lack of a fully asynchronous processing channel
  ○ Blocking communication: the infamous JacORB blocking issue
  ○ Lack of low-level control of IO resources (sockets, request queues)
• Development issues
  ○ Difficult to extend the wire protocol: backward compatibility issue
  ○ Complex, error-prone API
  ○ Heavy in memory usage
Summary: Why change CORBA?
• CORBA was chosen 15 years ago
  ○ Not actively maintained: big risk for the MW project
  ○ Better solutions exist on the market
  ○ Invest in a future solution rather than maintaining the old one
• With the current CORBA-based middleware
  ○ We can’t solve the pending operational issues
  ○ We can’t provide better scalability & reliability
  ○ CMW-RDA is difficult to evolve & extend
Agenda: Middleware Review process
Middleware Renovation process
• MW Renovation = MW Review + MW Upgrade
  ○ MW Review aims to provide the most appropriate technical solution satisfying the user requirements
  ○ MW Upgrade establishes the plan & strategy for introduction of the new MW
• Objective: LS1 is the unique opportunity for the major MW upgrade
• Middleware Review Process
  ○ Gathering of users' feedback and requirements ( )
  ○ Review of communication and serialization libraries ( )
  ○ Prototyping using selected communication products (2012)
  ○ Design & implementation of new RDA3: Data, Client & Server ( )
  ○ Testing & validation of core MW infrastructure (summer’13)
  ○ Upgrade of all dependent MW libraries & services ( ): JAPC, Directory Service, Proxy, DIP Gateway
Review of users' requirements: series of interviews with major users
• Lars Jensen, Stephen Jackson (BI)
• Andy Butterworth, Frode Weierud, Roman Sorokoletov (RF)
• Brice Copy, Clara Gaspar (DIP, DIM)
• Frederic Bernard, Herve Milcent, Alexander Egorov (PVSS)
• Alexey Dubrovskiy (CTF), Kris Kostro (DIP gateways)
• Marine Gourber-Pace, Nicolas Hoibian (Logging)
• Nicolas De Metz-Noblat (Front-Ends), Alastair Bland (Infrastructure)
• Michel Arruat (FESA), Stephen Page (FGC)
• Niall Stapley, Mark Buttner, Marek Misiowiec (LASER & DIAMON)
• Nicolas Magnin, Christophe Chanavat (ABT)
• Stephane Deghaye, Jakub Wozniak (InCA, SIS)
• Vito Baggiolini, Roman Gorbonosov (JAPC & DA systems)
+ regular feedback from OP + internal team input
New RDA3: Accepted requirements
• General
  ○ Java & C++ API, Win (64-bit) & Linux (SLC5 32-bit & SLC6 64-bit)
  ○ Accelerator Device Model (i.e. Device/Property)
  ○ Get, Set, Async-Get, Async-Set, Subscribe
  ○ Early detection of communication failures
  ○ Improve error reporting in all the layers: client, server, gateways
  ○ Admin interface & runtime diagnostics & statistics
• Data support
  ○ Data object: primitives, n-dim arrays, data structures
• Subscription mechanism
  ○ Subscription behaviour is the same regardless of the condition of the server (active, down)
  ○ Several client subscription policies (default: continuous)
  ○ Provide subscription notification ordering
  ○ First-Update enforced via CMW on the server side: provide a callback to the front-end framework for the server-side Get
  ○ Drop support for the on-change flag
  ○ Standardise use of subscription filters and update flags (e.g. immediate update)
  ○ Add a header to acquired Data: common metadata (e.g. acq. stamp, cycle name)
  ○ All loss of data (dropped updates) must be notified to clients
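To make the Device/Property model and the First-Update requirement concrete, here is a minimal in-process sketch. It is not the RDA3 API: the class names `AcquiredData` and `DeviceServer`, the header field names and the listener signature are all assumptions chosen for illustration, and no network transport is involved.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class AcquiredData:
    """A value plus a header with common metadata (field names are assumptions)."""
    value: Any
    acq_stamp: float
    cycle_name: str = ""


# Hypothetical listener signature: (device, property, data)
Listener = Callable[[str, str, AcquiredData], None]


class DeviceServer:
    """Toy in-memory server exposing Get, Set and Subscribe on device/property pairs."""

    def __init__(self) -> None:
        self._values: Dict[Tuple[str, str], AcquiredData] = {}
        self._listeners: Dict[Tuple[str, str], List[Listener]] = {}

    def set(self, device: str, prop: str, value: Any, cycle_name: str = "") -> None:
        data = AcquiredData(value, time.time(), cycle_name)
        self._values[(device, prop)] = data
        # Notify subscribers in registration order.
        for listener in self._listeners.get((device, prop), []):
            listener(device, prop, data)

    def get(self, device: str, prop: str) -> AcquiredData:
        return self._values[(device, prop)]

    def subscribe(self, device: str, prop: str, listener: Listener) -> None:
        self._listeners.setdefault((device, prop), []).append(listener)
        # First-Update: a new subscriber immediately receives the current
        # value, obtained through a server-side Get, if one exists.
        current = self._values.get((device, prop))
        if current is not None:
            listener(device, prop, current)
```

A subscriber registered after a value has been set receives that value at once and then every later update, which is the behaviour the First-Update bullet asks the middleware to enforce on the server side.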
New RDA3: Accepted requirements
• Client side
  ○ RDA3 client API connects with both: RDA2 (old) & RDA3 (new) servers
  ○ Efficient mechanism for: connection, disconnection & reconnection
  ○ Must be able to recover from any interruption of communication with the server: server restarts, IP address change, rename/move of a device to another server
  ○ Improved semantics of Array Calls, i.e. handling of individual parameters
  ○ Enhanced diagnostics & collection of statistics
• Server side
  ○ Policies for discarding notifications, i.e. deal with overflows and ’bad clients’: instrument with counters & timings that allow diagnosing the delivery of notifications
  ○ Prioritisation of Get/Set requests for high-priority clients
  ○ Server-side subscription tree fully managed by CMW: the server does not need to manage client subscriptions any more
  ○ Manage the client connections, e.g. forced disconnect of a client
  ○ Client lifetime callbacks (i.e. connected, disconnected)
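One possible reading of the server-side discard policy, combined with the requirement that dropped updates are reported to clients, is a bounded per-client queue that discards the oldest update on overflow and counts what it discards. The sketch below is only an illustration under that assumption; the class name `ClientNotificationQueue`, the drop-oldest choice and the counter names are not taken from RDA3.

```python
from collections import deque
from typing import Any, Deque, Optional, Tuple


class ClientNotificationQueue:
    """Bounded per-client queue with a drop-oldest policy (an assumed policy)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._queue: Deque[Any] = deque()
        self.delivered = 0            # diagnostic counter: updates handed to the client
        self.dropped = 0              # diagnostic counter: updates discarded on overflow
        self._dropped_since_poll = 0  # loss not yet reported to the client

    def publish(self, update: Any) -> None:
        """Server side: enqueue an update, discarding the oldest one when full."""
        if len(self._queue) >= self._capacity:
            self._queue.popleft()
            self.dropped += 1
            self._dropped_since_poll += 1
        self._queue.append(update)

    def poll(self) -> Optional[Tuple[Any, int]]:
        """Client side: return (update, lost), where lost is the number of
        updates discarded since the previous poll, or None if the queue is empty."""
        if not self._queue:
            return None
        update = self._queue.popleft()
        lost = self._dropped_since_poll
        self._dropped_since_poll = 0
        self.delivered += 1
        return update, lost
```

A slow client that polls after four updates were published into a queue of capacity 2 receives the third update together with a loss count of 2, so the gap is visible both to the client and in the server-side counters.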
New RDA3: Accepted requirements
• Server side (cont.)
  ○ Client discovery for diagnostics purposes (i.e. connected clients with payload)
  ○ Enhanced diagnostics & collection of statistics
• Ongoing discussions (not accepted yet)
  ○ Prioritisation of subscription notifications for high-priority clients
• Technical notes
  ○ Invest in asynchronous & non-blocking communication
  ○ Prefer 0-copy & lock-free data structures, message queues
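The difference between a blocking Get and the Async-Get operation listed among the requirements can be sketched with futures. This is an assumption-laden illustration: `AsyncClient` and its `read_property` callable are hypothetical, and a thread pool merely stands in for the asynchronous, non-blocking channel that the technical notes call for.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Optional


class AsyncClient:
    """Toy client offering a blocking get() and a non-blocking async_get()."""

    def __init__(self, read_property: Callable[[str, str], Any], workers: int = 4) -> None:
        # read_property is a hypothetical stand-in for the real transport call.
        self._read = read_property
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def async_get(self, device: str, prop: str) -> Future:
        """Return immediately; the caller collects the result from the Future."""
        return self._pool.submit(self._read, device, prop)

    def get(self, device: str, prop: str, timeout: Optional[float] = None) -> Any:
        """Synchronous Get expressed on top of the asynchronous call."""
        return self.async_get(device, prop).result(timeout)

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

With this shape a caller can issue several requests at once and collect the replies later, so one slow server does not stall requests to the others.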
New RDA3: Summary of requirements
• Unchanged
  ○ Device/Property model
  ○ Set of basic operations (Get, Set, Subscribe)
• Fixes & improvements
  ○ Subscription mechanism
  ○ Connection management
  ○ Diagnostics & statistics
• New functionality
  ○ Policies for subscription management (client & server)
  ○ Client priorities
  ○ Server-side subscription tree
  ○ Extended Data support
  ○ Standardise First-Update concept
Agenda: Technical evaluation of the transport layer
Middleware transport requirements
[Diagram: requirements grouped into three levels: Desirable, Mandatory, Fundamental]
Requirements shown on the diagram:
• Lightweight
• Friendly API, documentation
• Request/reply & pub/sub patterns
• Open source license
• Asynchronous
• Active community
• Stability, Maturity & Longevity
• Performance & Scalability
• C++/Java
• Linux/Windows
• Over TCP/IP LAN
Evaluated middleware products
• Ice, Thrift, omniORB, JacORB, YAMI
• OpenSplice DDS, CoreDX, RTI DDS
• ZeroMQ, OpenAMQ, Qpid, RabbitMQ
• MQTT: RSMB, Mosquitto
All opinions are based only on our knowledge and evaluation. Each of the products, depending on the requirements, may constitute a good solution.
Andrzej Dworak, ICALEPCS 2011
Products comparison (according to the criteria)
Criteria: Sync, async & msg patterns; QoS; Dependencies & memory footprint; Performance; Look & feel, API, docs; Community & maturity

Product   Score
ZeroMQ    6
Ice       5
YAMI4     4
RTI       3
Qpid      3
CORBA     2
Thrift    2

Andrzej Dworak, ICALEPCS 2011
Conclusions
• Several good middleware solutions are available
• The choice is dictated by the most critical requirements
  ○ Not easy: performance matters, but so do ease of use, community, …
• Prototyping was done with the most promising candidates: ZeroMQ, Ice & YAMI
• Finally we decided to choose ZeroMQ
  ○ Asynchronous & non-blocking communication
  ○ 0-copy & lock-free data structures, message queues
  ○ Nice API, good documentation & active community
New RDA3 Java: Sync Get round-trip time
[Plot not preserved]
Test setup: 1 kB message payload, cs-ccr-* machines, 1 server host & 10 client hosts
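For readers who want to see what a synchronous round-trip measurement with a 1 kB payload looks like in code, the sketch below times request/reply exchanges over a local socket pair. It does not reproduce the test above: it uses neither RDA3 nor ZeroMQ, runs in one process instead of on cs-ccr-* hosts, and the function names are illustrative, so its numbers are not comparable with the slide's results.

```python
import socket
import threading
import time
from typing import List

PAYLOAD_SIZE = 1024  # 1 kB, matching the payload size quoted in the test setup


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, since recv() may return fewer."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_server(sock: socket.socket, count: int) -> None:
    """Reply to `count` requests by echoing the payload back."""
    for _ in range(count):
        sock.sendall(recv_exact(sock, PAYLOAD_SIZE))


def measure_round_trips(count: int = 200) -> List[float]:
    """Return one round-trip time in seconds per synchronous request."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=echo_server, args=(server, count))
    worker.start()
    payload = b"x" * PAYLOAD_SIZE
    samples: List[float] = []
    try:
        for _ in range(count):
            start = time.perf_counter()
            client.sendall(payload)
            reply = recv_exact(client, PAYLOAD_SIZE)
            samples.append(time.perf_counter() - start)
            if reply != payload:
                raise RuntimeError("echo mismatch")
    finally:
        client.close()  # unblocks the echo thread if the loop failed early
        worker.join()
        server.close()
    return samples
```

Calling `measure_round_trips()` and taking, for example, `statistics.median` of the result gives a single latency figure per run.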
New RDA3 Java: subscription notification latency
[Plot not preserved]
Test setup: 1 kB message payload, cs-ccr-* machines, 1 server host & 10 client hosts
Agenda: Changes in the MW Architecture in LS1
Current MW Architecture
[Architecture diagram; layers: User written, Middleware, Central services]
• Clients: Java Control Programs, VB, Excel, LabView, C++ Programs, Administration console, Passerelle C++
• Client-side middleware: JAPC API, RDA Client API (C++/Java) with the Device/Property Model
• Transport: CMW Infrastructure over CORBA-IIOP
• Server-side middleware: RDA Server API (C++/Java) with the Device/Property Model, CMW integration
• Servers: Virtual Devices (Java), PS-GM Server, FESA Server, FGC Server, PVSS Gateway, more servers
• Physical Devices (BI, BT, CRYO, COLL, QPS, PC, RF, VAC, …)
• Central services: Directory Service, Configuration Database (CCDB), RBAC A1 Service
Changes in MW Architecture in LS1
[Architecture diagram; layers: User written, Middleware, Central services; annotated "Upgrade in LS1"]
• Clients: Java Control Programs, VB, Excel, LabView, C++ Programs, Administration console, Passerelle C++
• Client-side middleware: JAPC API, RDA Client API (C++/Java) with the Device/Property Model
• Transport: CMW Infrastructure over ZeroMQ (replacing CORBA-IIOP)
• Server-side middleware: RDA Server API (C++/Java) with the Device/Property Model, CMW integration
• Servers: Virtual Devices (Java), PS-GM Server, FESA Server, FGC Server, PVSS Gateway, more servers
• Physical Devices (BI, BT, CRYO, COLL, QPS, PC, RF, VAC, …)
• Central services: Directory Service, Configuration Database (CCDB), RBAC A1 Service
Agenda: MW Upgrade milestones in 2013
MW Upgrade Milestones in 2013

Milestone                                   Completed by
RDA3 Java (client/server) (alpha)           June’13
RDA3 C++ server (alpha)                     July’13
RDA3 integration with: FESA, FGC, PVSS      July-Oct’13
RDA3 C++/Java (client/server) validated     September’13
New JAPC release with RDA3 Java             September’13
New FESA3.2 release with RDA3               December’13

Timeline:
• RDA3 C++: July’13
• Integration with FESA, FGC, PVSS: July-Oct’13
• RDA3 validated, new JAPC: September’13
• New FESA3.2: December’13
• Tests with eqp.: Winter’13/14
• End of LS1: August’14
• End-of-Life for RDA2: LS2
MW Upgrade strategy in LS1 and towards LS2
• No BIG-BANG migration, but a gradual one
• The new RDA3 client library is backward compatible (connection-wise)
  ○ New RDA3 clients can communicate with RDA2 & RDA3 servers
• FESA3 will exist with both: old RDA2 (FESA3.1) and new RDA3 (FESA3.2)
[Diagram: Old JAPC with old RDA2 client and New JAPC with new RDA3 client; old RDA2 servers (FESA2.10, FESA3.1) and new RDA3 server (FESA3.2); RDA2-RDA3 Gateway]
Diagram annotations:
• Client apps will migrate during LS1
• Only for justified, exceptional cases
• FEC developers should migrate to FESA3.2 ASAP
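The statement that new RDA3 clients can communicate with RDA2 & RDA3 servers can be pictured as a client facade that selects a transport per device. The sketch below is a guess at the shape of such a facade, not the RDA3 implementation: the class `Rda3CompatibleClient`, the `lookup_protocol` callable (standing in for some form of server lookup) and the protocol labels are all hypothetical.

```python
from typing import Any, Callable, Dict

# Hypothetical transport signature: (device, property) -> value
Transport = Callable[[str, str], Any]


class Rda3CompatibleClient:
    """Toy client that talks to a device over whichever protocol its server speaks."""

    def __init__(
        self,
        lookup_protocol: Callable[[str], str],
        transports: Dict[str, Transport],
    ) -> None:
        # lookup_protocol maps a device name to a protocol label such as
        # "RDA2" or "RDA3"; how the real client learns this is not covered here.
        self._lookup = lookup_protocol
        self._transports = transports

    def get(self, device: str, prop: str) -> Any:
        protocol = self._lookup(device)
        try:
            transport = self._transports[protocol]
        except KeyError:
            raise ValueError(f"no transport for protocol {protocol!r}") from None
        return transport(device, prop)
```

Because the choice is made per device, client applications keep working unchanged while individual front-ends move from RDA2 to RDA3 servers, which is what makes a gradual migration possible.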
Conclusions
• We have to replace CORBA with a new solution
• We collected updated users' requirements
• The MW upgrade will be performed during LS1 ( )
• Interoperability between RDA2 and RDA3
• Gradual control system migration until LS2 (end-2017)
• End-of-Life for RDA2: LS2