EPICS Channel History Storage


Using InfluxDB for Control System Data Storage and Retrieval
Megan Grodowitz, Kay Kasemir
May 2017 EPICS Meeting, KURRI, Japan

EPICS Channel History Storage
- Requirements (Needs)
  - Reliable, Available, Maintainable
  - Compatible with multiple languages
  - Long-term data storage
  - Performant enough…
- Requirements (Wants)
  - Much better performance than current minimal requirement
  - Better user experience from clients reading data
  - Room to add many more channels to archives than we are currently doing
  - Ease of use (installation, querying)
  - External tools availability

History of Control System Data Storage at SNS
- In the beginning: Channel Archiver (2000-2009)
  - Custom solution -> fast C++ access only, unmaintainable file structure
- Oracle RDB (2009-present)
  - Very reliable
  - Slow
  - Quirks
    - Picked TIMESTAMP w/o TIMEZONE (hard to correct w/ existing data)
    - Inefficient SQL to obtain sample at-or-before start time
    - Purging time ranges requires special consideration in data layout

InfluxDB
https://www.influxdata.com/
- Time series database
  - NoSQL
  - Each sample in a series is identified by a unique timestamp
- REST API interface with query language
  - Reachable from any language or script that can make HTTP requests
  - Responses in JSON format
- Designed to be one component in a set of interoperable services
  - Does not try to do everything in one package. Just stores and retrieves time series data as efficiently as possible
- Use cases:
  - IoT: gather data from sensors and make decisions from data analysis
    - https://spiio.com/ (plant irrigation monitoring and planning)
    - http://www.bboxx.co.uk/ (solar energy devices deployed to remote locales)
  - Datacenter monitoring
    - Replace tools like Nagios/Elasticsearch to keep large numbers of computer systems up and running
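To illustrate the "any language that can make HTTP requests" point, here is a minimal Python sketch of how a client might build a request for the InfluxDB 1.x `/query` endpoint and flatten the JSON reply. The host, database name, query, and the sample reply are made up for illustration, and no network call is made.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, database, influxql):
    """Return the URL of an InfluxDB 1.x /query request."""
    # 'epoch=ms' asks the server for integer millisecond timestamps.
    params = urlencode({"db": database, "q": influxql, "epoch": "ms"})
    return f"{base_url}/query?{params}"


def rows_from_response(body):
    """Flatten a /query JSON reply into a list of {column: value} dicts."""
    rows = []
    for result in json.loads(body).get("results", []):
        for series in result.get("series", []):
            for values in series["values"]:
                rows.append(dict(zip(series["columns"], values)))
    return rows


# Hypothetical reply, shaped like the JSON that InfluxDB 1.x returns.
SAMPLE_REPLY = """
{"results": [{"statement_id": 0, "series": [{
    "name": "Temperature",
    "columns": ["time", "degrees"],
    "values": [[1493049600000, 78.8], [1493049660000, 79.1]]}]}]}
"""

url = build_query_url(
    "http://localhost:8086",          # assumed server address
    "archive_test_data",
    'SELECT "degrees" FROM "Temperature" LIMIT 2',
)
rows = rows_from_response(SAMPLE_REPLY)
print(url)
print(rows)
```

In a real client the URL would be fetched with any HTTP library and the response body passed to `rows_from_response`.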

InfluxDB Database concepts
- Measurements (correspond to ‘PV’, ‘Channel’)
  - Sort of like an RDB table, top level structure
  - Example measurement: Temperature
- Tags
  - How data is indexed and searched, always a string key/value pair
  - Example tag: location=Northwest corridor
- Fields
  - What data is stored, string key with value of long, double, bool, or string type
  - Example field: degrees=78.8
- Retention Policy
  - How long to keep data, and what to do with it as it ages
  - Example retention policy: Create retention policy “one_day_by_hour” duration 1d shard duration 1h
    - Hold onto this data for one day, make a new shard of data for each hour
- Series
  - A unique set of measurement + tags + retention policy
  - Contains samples with unique time stamps and arbitrary field data
- Tags should not have a large number of distinct values
- Fields are values that are being monitored and take on many different values
- Retention policies can be used to set up continuous downsampling, since values are searchable by retention policy name
- The larger the number of series, the more hardware resources are required
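The measurement, tag, and field concepts map directly onto the InfluxDB line protocol used for writes. The following Python sketch formats the slide's Temperature example as one line; the helper names and the timestamp are illustrative assumptions, and only the basic escaping rules are covered.

```python
def _escape(text, chars):
    """Backslash-escape each of the given characters in text."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value):
    """Render a field value: bool, long (integer), double, or string."""
    if isinstance(value, bool):       # check bool first, it is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"            # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value field=value timestamp."""
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + format_field(value)
        for key, value in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"


line = to_line(
    "Temperature",
    {"location": "Northwest corridor"},
    {"degrees": 78.8},
    1493049600000000000,              # made-up timestamp in nanoseconds
)
print(line)
```

The tag value's space is escaped because an unescaped space separates the tag set from the field set.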

[Comparison table; columns: Custom | RDB | InfluxDB. Only some cell values survive in the transcript.]
- Reliability, Uptime: DIY | High | High?
- Speed: Fast | Slow
- Access from Java, C, Python, ..: Yes
- Web Access
- Query Language
- Data Retention, Decimation: DIY | partitioning can help
- Patch existing Samples
- Insert/Remove older Samples
- Rename Channel: Same as copying samples
- Store Images, large Waveforms: No
- Used beyond ‘EPICS Archive’

Archive Engine using InfluxDB
[Diagram labels: PV, RDB Config, RDB Writer, RDB; PV, XML Config, Influx Writer, XML, InfluxDB]

Sources
- https://github.com/ghmegan/influxdb-java
  - InfluxDB library for Java
- https://github.com/ghmegan/archive-influxdb
  - archive.config.xml, archive.writer.influx and .reader.influx
- https://github.com/ControlSystemStudio/cs-studio
  - CS-Studio branch “influxdb-archive-app”
- https://github.com/ControlSystemStudio/org.csstudio.sns
  - Archive Engine Binaries on http://ctrl-ci.sns.gov/snapshot/css-nightly
- .. about to be merged into main cs-studio repository..

EPICS-Specific Storage in InfluxDB
- Two databases
- ‘data’: samples with values stored in fields labeled by type and index
  - A typical EPICS PV (double type) uses one field, called “double.0”, along with tag values for status, severity, NaN flag
  - An array PV uses fields named “double.0”, “double.1”, etc., and can change array size dynamically for each sample
- ‘meta’: logs a new entry each time the PV type changes
  - In the typical case, the metadata store contains one entry from the first time the PV was logged, which tracks the date the PV was added to the archive
  - If a PV changes from double to string type, a new metadata entry is added when it changes, and the old samples are kept without any changes
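The field naming scheme can be sketched in a few lines of Python. This is not the archive writer's actual code: only the “double.N” names come from the slides, while the other type names, the function names, and the PV name are assumptions for illustration, and the NaN flag tag is omitted.

```python
# Assumed mapping from Python types to field name prefixes.
# Only "double" is confirmed by the slides; the others are hypothetical.
TYPE_NAMES = {float: "double", int: "long", str: "string"}


def sample_fields(value):
    """Map a scalar or array PV value to fields named '<type>.<index>'."""
    elements = value if isinstance(value, (list, tuple)) else [value]
    return {
        f"{TYPE_NAMES[type(element)]}.{index}": element
        for index, element in enumerate(elements)
    }


def pv_sample(name, value, severity, status):
    """Bundle one sample: the PV name is the measurement, alarm info goes in tags."""
    return {
        "measurement": name,
        "tags": {"severity": severity, "status": status},
        "fields": sample_fields(value),
    }


scalar = pv_sample("Demo:Temperature", 78.8, "NONE", "NO_ALARM")
array = pv_sample("Demo:Waveform", [1.0, 2.0, 3.0], "NONE", "NO_ALARM")
print(scalar["fields"])
print(array["fields"])
```

Because each sample carries its own set of fields, an array PV can have a different length in every sample without any schema change.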

Free/Open source InfluxDB instance for testing
- 2 databases in use for our test system
- Metadata timestamp shows this PV was added at 4pm GMT on 4/24
- Select earliest five samples stored
- Select most recent 5 samples stored
- PV value in double.0 is a field, and takes on many different values.
- Severity and status are tags, few values, used as indexes. Tags are not stored over and over again for the same value

EPICS PV Storage: Speed Tests
- RDB Setup:
  - PostgreSQL 9
  - Intel Core Duo 3GHz, Windows 7
  - 250GB 7200 RPM SATA Disk
- InfluxDB Setup:
  - InfluxDB ver. 1.1
  - Intel i7 3.6GHz, RHEL 7
  - 500GB 7200 RPM SATA Disk
- Read Test Setup
  - Read PV values from a given time range, up to a max of 1 million values
  - Includes initial sample, i.e. last sample at-or-before requested start time
  - Reading this initial sample can at times add a large delay to reading from RDB (*)
- Write Test Setup
  - Write as many samples as possible in 60 seconds

      | RDB                                             | InfluxDB
Read  | ~96K samples/sec (*)                            | ~353K samples/sec
Write | rewriteBatchedStatements=false: ~7K samples/sec | flushCount=10K (recommended): ~110K samples/sec
      | rewriteBatchedStatements=true: ~21K samples/sec | flushCount=200K (less reliable): ~185K samples/sec
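The write numbers depend on how many samples are buffered before each flush. The Python sketch below shows the general batching idea, assuming flushCount means "samples buffered before one write request"; the class and its `send` callback are hypothetical and not the Archive Engine's implementation.

```python
class BatchingWriter:
    """Buffer line-protocol lines and send them in batches of flush_count."""

    def __init__(self, send, flush_count):
        self._send = send                # callable that receives one batch as text
        self._flush_count = flush_count
        self._pending = []

    def write(self, line):
        self._pending.append(line)
        if len(self._pending) >= self._flush_count:
            self.flush()

    def flush(self):
        if self._pending:
            self._send("\n".join(self._pending))
            self._pending = []

    @property
    def unflushed(self):
        """Number of samples that would be lost if the process died now."""
        return len(self._pending)


batches = []                             # stands in for HTTP POSTs to the server
writer = BatchingWriter(batches.append, flush_count=3)
for i in range(7):
    writer.write(f"Demo:PV double.0={float(i)} {i}")

print(len(batches), writer.unflushed)
writer.flush()
print(len(batches), writer.unflushed)
```

A larger batch means fewer requests per sample, but also more buffered samples at risk between flushes, which may be what the "less reliable" note on the larger flushCount refers to.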

CSS Databrowser
- Oracle and InfluxDB data sources for the same PV over 9 days
- InfluxDB retrieval plugin does not do server side downsampling yet, so the lower graph includes spikes indicating the std deviation of the data points being averaged on the client side
- InfluxDB is added as a datasource, with the URL:port for the server
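Client-side averaging of this kind can be sketched as follows: raw samples are grouped into fixed-width time bins, and each bin reports its mean and standard deviation. The bin width, the use of the population standard deviation, and the function name are assumptions for illustration, not the Data Browser's actual algorithm.

```python
from statistics import mean, pstdev


def downsample(samples, bin_seconds):
    """Group (time_seconds, value) pairs into bins of bin_seconds.

    Returns a list of (bin_start, mean, std_dev, count), ordered by time.
    """
    bins = {}
    for time_seconds, value in samples:
        start = (time_seconds // bin_seconds) * bin_seconds
        bins.setdefault(start, []).append(value)
    return [
        (start, mean(values), pstdev(values), len(values))
        for start, values in sorted(bins.items())
    ]


raw = [(0, 1.0), (10, 3.0), (65, 5.0)]   # made-up samples
binned = downsample(raw, bin_seconds=60)
print(binned)
```

A plot of the averaged trace with the standard deviation drawn around it shows where the raw data varied strongly within a bin.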

Non-EPICS data access in CSS
- During work on the InfluxDB plugins, we connected with other people at SNS using InfluxDB to log non-EPICS data
  - Python scripts doing analysis and dumping samples into various schemas on various systems
  - Using tools like Grafana to view data
  - Wanted CS-Studio databrowser functionality for their data
- Single environment with EPICS and non-EPICS data… Can we support both? How?
- Created a separate plugin for raw access to InfluxDB through CS-Studio
  - No metadata required, it is all generated with default values
  - User indicates “influxdb-raw” data source, then notates PVs with InfluxDB HTTP request protocol: database name, measurement name, tags, field
  - Any data stored in InfluxDB in any format is now viewable in CS-Studio

Data Browser for Generic InfluxDB
- EPICS schema
  - Data Source: influxdb://host.site.org:8086
  - PV: “MyRecord”
- Any schema
  - Data Source: influxdb-raw://host.site.org:8086?... db=archive_test_data& user=..&password=..
  - PV: “measurement,tag=value field”
    - Measurement, Tags (optional), Field
    - RTBT_Diag:BCM25I:Power10,severity=NONE double.0
- Drag search results
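The “measurement,tag=value field” notation can be split into its parts with a few lines of Python. This is only a sketch of the naming convention shown on the slide, not the plugin's parser; the function name is hypothetical and escaped commas or spaces are not handled.

```python
def parse_raw_pv(name):
    """Split 'measurement,tag=value field' into (measurement, tags, field)."""
    series, separator, field = name.rpartition(" ")
    if not separator or not series or not field:
        raise ValueError(f"expected 'measurement[,tag=value...] field': {name!r}")
    measurement, *tag_parts = series.split(",")
    # Each remaining part must look like key=value; tags are optional.
    tags = dict(part.split("=", 1) for part in tag_parts)
    return measurement, tags, field


parsed = parse_raw_pv("RTBT_Diag:BCM25I:Power10,severity=NONE double.0")
print(parsed)
```

The same function accepts a name without tags, such as `"Temperature degrees"`, and returns an empty tag dictionary.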

Grafana
- Generic InfluxDB data viewing tool
  - https://grafana.com/
  - Free on Windows/Mac/Linux
- This graph took 5 minutes to set up
  - Add data source for our InfluxDB server
  - Set archive_test_data as the database
  - Use the dropdown menu in the panel creator to select from a list of PVs in the database
  - Use the time range selection at the upper right to set the range to graph
  - Click and edit titles, axes, …

InfluxDB
- Time-series database
  - Ideal for logging EPICS PVs
  - Faster than RDB
- Supported by CS-Studio
  - Archive Engine
  - Data Browser
- Pending tests
  - Archive Engine w/ SNS configs
  - Synthetic data for longer time range based on actual site config files
  - Checking free vs. commercial license support & multi-node installation