EPICS Archiving Appliance Test at ESS

EPICS Archiving Appliance Test at ESS
J. Bobnar, S. Gysin
www.europeanspallationsource.se
November 25, 2014

Goal
Assess the feasibility of the EPICS Archiver Appliance (AA) for the European Spallation Source:
- Measure performance and compare to requirements
- Propose new features for the services
http://epicsarchiverap.sourceforge.net/

Requirements: Capacity Planning

Description                               | # records | records archived | bytes/record | record/sec | bytes/sec | GB/day
Rack estimation (ESS Bilbao Ion source)   | 28,400    | 2,840            | 14.3         | 1.00       | 40,612    | 3.31
SNS (BEAUtY)                              | 340,000   | 85,000           | 30           | 0.02       | 52,298    | 4.21
FRIB (estimates)                          | 200,000   |                  | 8            | 0.20       | 320,000   | 26
SLAC: Archiver Appliance test (test-arch) | 102,255   |                  |              | 0.03       | 80,406    | 6.47
Jaka: Medical Accelerator (BEAUtY)        | 150,000   |                  |              | 0.22       | 994,205   | 80
LHC logging (MDB)                         |           |                  |              |            | 3,625,990 | 292

For ESS we decided to double the capacity of SNS:

Description  | # records | records archived | bytes/record | record/sec | GB/day
SNS          | 340,000   | 85,000           | 30           | 0.02       | 4.21
ESS (2x SNS) | 680,000   | 170,000          |              |            | 8.42
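The table columns hang together arithmetically; a small cross-check sketch (our arithmetic, not from the slide; note the GB/day column only reproduces if "GB" is read as binary gigabytes):

```python
# Sketch: cross-check two relationships implied by the table (our reading):
#   bytes/sec ~ records archived * bytes/record * record/sec
#   GB/day    = bytes/sec * 86,400 s, with "GB" read as binary (2**30 bytes)
SECONDS_PER_DAY = 86_400
GIB = 2 ** 30

# Rack estimation row: 2,840 * 14.3 * 1.00 reproduces the 40,612 bytes/sec
print(2_840 * 14.3 * 1.00)  # 40612.0

# GB/day column from the bytes/sec column:
for name, bps in [("SNS", 52_298), ("FRIB", 320_000),
                  ("SLAC", 80_406), ("Jaka", 994_205)]:
    print(name, round(bps * SECONDS_PER_DAY / GIB, 2))
# SNS 4.21, FRIB 25.75, SLAC 6.47, Jaka 80.0 -- matching the table
```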

But… there will be spikes in the rate at which data is archived:
- Waveforms are significantly larger (~5 kB/record)
- Post mortem buffers: ~15 GB per beam stop; 1 beam stop/hour = 24 beam stops/day = 360 GB/day (commissioning)
- Data on demand: 10 events/day, 1000 channels, ~2 MB per channel per event = 20 GB/day
- EPICS V4 data types

Short, Medium and Long Term Archiving
Examples:
- SLAC Archiver Appliance: 1 hour, 1 day, 1 year
- FRIB (planned): 1 week, 1 month, forever
- LHC Timber logging system: MDB 7 days, LDB > 20 years
- SNS Archiving Service: no division
- DESY: 1 month, forever
ESS requirements:
- Short term: 10 days (8.4 GB/day)
- Medium term: 100 days (20% of short term = 1.9 GB/day)
- Long term: forever (20% of medium term = 0.19 GB/day)
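A back-of-the-envelope sketch of the disk footprint these retention windows imply (our arithmetic on the slide's rates):

```python
# Sketch: disk footprint implied by the ESS retention tiers above.
short_rate, medium_rate, long_rate = 8.4, 1.9, 0.19  # GB/day, from the slide

print(10 * short_rate)    # short term window (10 days):   ~84 GB
print(100 * medium_rate)  # medium term window (100 days): ~190 GB
print(365 * long_rate)    # long term ("forever"): ~69 GB of growth per year
```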

Rate of Retrieval
Depends on:
- The archive rate
- Reduction algorithm
- Number of clients simultaneously reading data
- Hardware
Requirement for retrieval from short term storage: not slower than 1000 points/sec

Test Setup
2 dedicated machines on a dedicated network, both running the CODAC version of Scientific Linux 4.3.
Archiver Appliance computer:
- Intel Xeon 8-core (16-thread) CPU, 16 GB RAM
- Solid state drive; performance ~240 MB/s random read and ~280 MB/s sequential write
ESS Control Box with IOC:
- 30,000 scalar double-type PVs
- 200 waveform (aSub) long-type PVs of length 1000
- Both at 10 Hz
Units: "number of samples per second", N/s = number of PVs × 10 Hz
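In these units, the test IOC above can source the rates quoted on the following slides; a trivial sketch of the arithmetic:

```python
# Sketch: rates implied by the test IOC configuration above.
HZ = 10
scalar_ns = 30_000 * HZ   # 300,000 N/s available from the scalar PVs
waveform_ns = 200 * HZ    # 2,000 N/s from the waveform PVs
waveform_mbps = waveform_ns * 8e3 / 1e6  # with 1 N ~ 8 kB: ~16 MB/s to disk
print(scalar_ns, waveform_ns, waveform_mbps)
```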

Test Results: Scalars, the JVM Needs an Optimal Setup
Adaptive heap memory (-Xms < -Xmx):
- 20,000 N/s: all is well
- 30,000 N/s: event drop rate 0.04%
- > 30,000 N/s: higher drop rate
- Performance degrader: management of the Java heap size by the virtual machine (CPU was at 100% the whole time)
Fixed heap size (8 GB for the engine): 100,000 N/s without a problem
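A minimal sketch of the flag pairing that made the difference. In a real Archiver Appliance deployment these flags would go into the Tomcat JVM options rather than a hand-built command line; the jar name below is purely a placeholder:

```python
# Sketch: pin the JVM heap so the GC never resizes it (-Xms == -Xmx).
# The appliance normally runs as web apps under Tomcat; this hand-rolled
# launch only illustrates the flags. "engine.jar" is a placeholder.
import subprocess

HEAP = "8g"  # fixed 8 GB heap for the engine, as in the test
cmd = [
    "java",
    f"-Xms{HEAP}",  # initial heap size
    f"-Xmx{HEAP}",  # maximum heap size; equal values disable adaptive resizing
    "-jar", "engine.jar",
]
subprocess.run(cmd, check=True)
```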

Test Results: Scalars
- Saving 10 seconds' worth of data (1 M samples) with ETL running (transfer between short and medium term storage) took between 8 and 11 seconds
- Probable cause: the same physical drive was used for both short and medium term storage

Test Results: Scalars
- Increased the sampling rate to 300,000 N/s
- Saving 10 seconds' worth of data (3 M samples) took between 3.5 and 4 seconds
However:
- Event drops at startup
- With ETL running, the time increased by an order of magnitude and the drop rate was very high
- CPU time remained the same; I/O seems to be the bottleneck

Test Results: Waveforms
- 200 PVs of length 1000 at 10 Hz: 2,000 N/s, 1 N ≈ 8 kB
- Saving 10 seconds' worth of data took between 200 and 300 milliseconds
- When ETL was running, the time increased to about 1 second
- Archiving the same amount of data in waveforms is about 15 times faster than in scalar PVs; the number of PVs matters

Test Results: Rate of Retrieval, Scalars
Data stored: 100,000 N/s for 8 hours = 54 GB
- Short term: 2 files for the last hour
- Medium term: 1 file for the rest
Retrieval rate:
- Short intervals (minutes; fewer than 800 data points available): 100–150 ms
- Longer intervals (hours; more than 800 data points available): 200–400 ms
- Even longer intervals (1 day, 2 days): 700–800 ms and ~1500 ms
No problems with a large number of PVs (file fragmentation).
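For reproducibility, timings like these can be taken against the appliance's standard JSON retrieval endpoint; a sketch, assuming a default single-appliance install (retrieval on port 17668; host and PV name are placeholders):

```python
# Sketch: time a retrieval through the appliance's getData.json endpoint.
# Host, port and PV name are placeholders; adjust for your install.
import json
import time
import urllib.parse
import urllib.request

BASE = "http://archappl.example.org:17668/retrieval/data/getData.json"
query = urllib.parse.urlencode({
    "pv": "SOME:SCALAR:PV",
    "from": "2014-11-24T00:00:00.000Z",
    "to": "2014-11-25T00:00:00.000Z",
})

start = time.monotonic()
with urllib.request.urlopen(f"{BASE}?{query}") as resp:
    chunks = json.load(resp)  # list of {"meta": ..., "data": [...]} chunks
elapsed = time.monotonic() - start

npoints = sum(len(chunk["data"]) for chunk in chunks)
print(f"{npoints} samples in {elapsed * 1000:.0f} ms")
```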

Test Results: Rate of Retrieval, Waveforms
Retrieval rate:
- 1 hour interval (reduction: 36,000 -> 800 samples): ~3500 ms
- Every additional hour adds approximately 3000 ms
- 1 day interval (reduction: 864,000 -> 800 samples): > 1 min
Room for improvement in the reduction algorithm and in the client.
More tests are planned with a longer acquisition period.
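The reduction referred to here maps the stored samples onto a fixed number of display bins (800 above). A naive mean-per-bin sketch of that idea (our illustration only; the algorithm actually used by the appliance and client may differ):

```python
# Sketch: naive reduction of n samples to at most `bins` points by averaging
# each bin; illustrates the 36,000 -> 800 step mentioned above.
def reduce_mean(samples, bins=800):
    if len(samples) <= bins:
        return list(samples)
    out, step = [], len(samples) / bins
    for i in range(bins):
        lo, hi = int(i * step), int((i + 1) * step)
        chunk = samples[lo:hi]
        out.append(sum(chunk) / len(chunk))
    return out

data = list(range(36_000))     # one hour of samples at 10 Hz
print(len(reduce_mean(data)))  # 800
```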

Conclusion
- SNS archives 0.02 samples per second per PV; at 80,000 archived PVs that means 1,600 N/s.
- One EPICS Archiver Appliance can archive 100,000 N/s, which is about 60 times more.
- To reduce retrieval time, we recommend running several instances of the AA and distributing the PVs among them.
- The retrieval rate (for scalars) is good and meets the requirements: for the most common time intervals (i.e. 1 day or less) it is under 1 second.
- We also have a list of recommendations for the AA and for AA users, to be published after completion of the tests.