Published by Fay Joseph. Modified over 6 years ago.
1
PyTimber & CO
M. Betz, R. De Maria, M. Fitterer, C. Hernalsteens, T. Levens
Install: $ pip install pytimber
Sources:
2
PyTimber: design criteria
Package goal: provide a simple interface to the CALS API for interactive data analysis in Python:
- get data with a few lines of code
- work well in the typical scientific Python ecosystem: numpy (array objects), matplotlib (plotting library), ipython/jupyter (interactive environment), etc.
- use native Python objects for input and output: strings, lists, dictionaries, floating point numbers, arrays
- keep the Python API predictable: expose simple building blocks and provide higher-level functionality on top of them
- keep performance under control
- be easy to use from lxplus, the technical network, SWAN, and users' machines
- reduce prerequisites to the minimum
3
PyTimber: example
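A minimal sketch of the search-then-get workflow, written so that it runs offline. It assumes that `search` returns a list of variable names matching a pattern with `%` as wildcard, and that `get` returns a dictionary mapping each name to a `(timestamps, values)` pair. `FakeLoggingDB`, the `DEMO.*` variable names, and `mean_values` are hypothetical and exist only for illustration; in a real session the database object would come from pytimber itself (typically `pytimber.LoggingDB()`), which needs a JVM and access to the CERN network.

```python
import fnmatch


class FakeLoggingDB:
    """Offline stand-in mimicking the assumed return shapes (not part of pytimber)."""

    def __init__(self):
        timestamps = [0.0, 1.0, 2.0, 3.0]
        self._data = {
            "DEMO.B1:INTENSITY": (timestamps, [1.0, 2.0, 3.0, 4.0]),
            "DEMO.B2:INTENSITY": (timestamps, [4.0, 3.0, 2.0, 1.0]),
        }

    def search(self, pattern):
        # Assumption: '%' is the wildcard, as in SQL LIKE patterns.
        glob = pattern.replace("%", "*")
        return [n for n in sorted(self._data) if fnmatch.fnmatchcase(n, glob)]

    def get(self, variables, t1, t2):
        # Accept a single name or a list of names.
        if isinstance(variables, str):
            variables = [variables]
        out = {}
        for name in variables:
            ts, vs = self._data[name]
            kept = [(t, v) for t, v in zip(ts, vs) if t1 <= t <= t2]
            out[name] = ([t for t, _ in kept], [v for _, v in kept])
        return out


def mean_values(db, pattern, t1, t2):
    """Return the mean of every variable matching `pattern` in [t1, t2]."""
    names = db.search(pattern)
    data = db.get(names, t1, t2)
    return {
        name: sum(values) / len(values)
        for name, (timestamps, values) in data.items()
        if len(values) > 0  # skip variables with no data in the window
    }


db = FakeLoggingDB()
print(mean_values(db, "DEMO.%:INTENSITY", 1, 2))
```

Because `mean_values` only relies on `search` and `get`, the same function should work with any object exposing those two methods with the return shapes assumed here.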
4
PyTimber: prerequisites
- Python 2.7 or Python 3.x
- jpype: critical package that connects to a JVM, enabling the whole concept of leveraging Java APIs. It is not in the Python standard library, is community supported, is a complex package built on top of the Java Native Interface, is not straightforward to install, and is not easy to use in concurrent processes
- cmnbuild_dep_manager: gets jars, manages the JVM, resolves class locations; to be used from the CERN network
5
PyTimber: methods
- __init__: choose appid, clientid, source
- search: find variables from a string pattern
- get: takes variables, a time window, and an optional fundamental
  - Variables: string, pattern, or list of strings
  - Time window: strings (local time), unix time, datetime objects, 'last', 'next'
- tree: explore variable hierarchies
- getAligned: return aligned data
- getLHCFillData: return beam modes for a fill
- getLHCFillsByTime: return a list of fills
- getIntervalsByLHCModes: return a list of fill numbers and intervals
- getMetaData: get all metadata defined by a pattern or list
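The time window accepts several input types. The following is a sketch of how such heterogeneous inputs could be normalized to unix timestamps, not pytimber's actual implementation. The helper name `to_unix` and the two string formats are assumptions made for illustration; the special values 'last' and 'next' are left out because they select a query strategy rather than a point in time.

```python
from datetime import datetime

# Assumed string layouts, tried in order (with and without fractional seconds).
_FORMATS = ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S")


def to_unix(t):
    """Convert a number, datetime, or local-time string to a unix timestamp."""
    if isinstance(t, (int, float)):
        return float(t)  # already a unix timestamp
    if isinstance(t, datetime):
        # Naive datetimes are interpreted as local time by .timestamp().
        return t.timestamp()
    if isinstance(t, str):
        for fmt in _FORMATS:
            try:
                return datetime.strptime(t, fmt).timestamp()
            except ValueError:
                continue
        raise ValueError("unrecognized time string: %r" % t)
    raise TypeError("unsupported time type: %s" % type(t).__name__)
```

Note that the string branch depends on the time zone of the machine running the code, which is the behaviour the slide describes for string input.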
6
PyTimber: status and wish list
- API stable; a few methods could still be added
- Performance regressions since the first releases need to be investigated
- Better time zone management (choose UTC time as input, state time zones explicitly)
- Documentation is still sparse and incomplete: tutorial, examples in the source distribution, SWAN galleries, GitHub wiki, CO wiki
- Decide whether pytimber should evolve towards a large framework (generic methods plus application-specific classes) or stay as it is, with e.g. pytimbertools providing the additional functionality
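To illustrate why explicit time zones matter, the sketch below parses the same wall-clock string under two different zones and obtains two different unix timestamps. It uses only the standard library; `to_unix_explicit` and `unix_to_utc` are hypothetical helper names, not part of pytimber.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"


def to_unix_explicit(text, tz=timezone.utc):
    """Parse a wall-clock string and attach an explicit time zone (UTC by default)."""
    naive = datetime.strptime(text, FMT)
    return naive.replace(tzinfo=tz).timestamp()


def unix_to_utc(ts):
    """Convert a unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


utc_ts = to_unix_explicit("2016-08-01 12:00:00")
# Fixed +02:00 offset used as a stand-in for a local summer-time zone.
local_ts = to_unix_explicit("2016-08-01 12:00:00", tz=timezone(timedelta(hours=2)))
print(utc_ts - local_ts)  # the two interpretations differ by two hours (7200 s)
```

With the zone stated at the call site, the result no longer depends on the configuration of the machine running the query.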
7
Pytimber Tools: pagestore
Provide data persistence for pytimber time series.
- stores everything pytimber can emit
- subset of the pytimber API: search, and get from unix timestamps
- fast bulk reading performance: data stored contiguously in binary files
- large dataset management (~TB)
- separate locations for the data store (e.g. in EOS) and for the path and data indexes (e.g. local)
- slow writing speed, brittle concurrent writing
8
Pytimber Tools: pagestore example
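The storage idea (contiguous binary pages plus a separate index) can be sketched with the standard library alone. This is a conceptual illustration, not pagestore's actual API or file layout: the function names, the JSON index, and the page file naming are all assumptions.

```python
import json
import os
from array import array


def load_index(index_path):
    """Load the index (variable name -> list of page records), or an empty one."""
    if os.path.exists(index_path):
        with open(index_path) as f:
            return json.load(f)
    return {}


def store_page(data_dir, index_path, name, timestamps, values):
    """Write one page as contiguous float64 data and record it in the index."""
    if len(timestamps) == 0 or len(timestamps) != len(values):
        raise ValueError("timestamps and values must be non-empty and equal length")
    os.makedirs(data_dir, exist_ok=True)
    index = load_index(index_path)
    page_id = sum(len(pages) for pages in index.values())
    fname = "page_%06d.bin" % page_id
    with open(os.path.join(data_dir, fname), "wb") as f:
        array("d", timestamps).tofile(f)  # all timestamps first ...
        array("d", values).tofile(f)      # ... then all values
    index.setdefault(name, []).append({
        "file": fname,
        "count": len(timestamps),
        "t_first": float(timestamps[0]),
        "t_last": float(timestamps[-1]),
    })
    with open(index_path, "w") as f:
        json.dump(index, f)


def read_series(data_dir, index_path, name, t1, t2):
    """Read all samples of `name` with t1 <= t <= t2, touching only relevant pages."""
    ts_out, vs_out = [], []
    for page in load_index(index_path).get(name, []):
        if page["t_last"] < t1 or page["t_first"] > t2:
            continue  # the index lets us skip pages outside the window
        ts, vs = array("d"), array("d")
        with open(os.path.join(data_dir, page["file"]), "rb") as f:
            ts.fromfile(f, page["count"])
            vs.fromfile(f, page["count"])
        for t, v in zip(ts, vs):
            if t1 <= t <= t2:
                ts_out.append(t)
                vs_out.append(v)
    return ts_out, vs_out
```

Keeping the index small and separate means it can live on local disk while the bulk data sits on a remote file system. Rewriting the whole index on every write is also a simple way to see why concurrent writers would be fragile in a design like this.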
9
Pytimber Tools: dataquery
Encapsulate a query in an object:
- use pytimber or pagestore as the data source
- provide shortcuts to data arrays for interactive manipulation
- provide generic plots (lines, images for 2D arrays)
- methods to extend datasets by appending/inserting data or adding variables
- interpolate and align all variables
10
Pytimber Tools: dataquery example
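A sketch of the "interpolate and align" step, assuming each variable is held as a `(timestamps, values)` pair with sorted timestamps. It resamples every variable onto the timestamps of a chosen master variable using linear interpolation. The function names are hypothetical and the slides do not say which interpolation dataquery actually uses.

```python
from bisect import bisect_right


def interpolate(t, timestamps, values):
    """Linearly interpolate at time t; hold the edge value outside the range."""
    if t <= timestamps[0]:
        return values[0]
    if t >= timestamps[-1]:
        return values[-1]
    i = bisect_right(timestamps, t)  # timestamps[i-1] <= t < timestamps[i]
    t0, t1 = timestamps[i - 1], timestamps[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def align(data, master):
    """Resample every variable in `data` onto the timestamps of `master`."""
    grid = list(data[master][0])
    aligned = {
        name: [interpolate(t, ts, vs) for t in grid]
        for name, (ts, vs) in data.items()
    }
    return grid, aligned


data = {
    "A": ([0, 2, 4], [0.0, 2.0, 4.0]),
    "B": ([1, 3], [10.0, 30.0]),
}
grid, aligned = align(data, "A")
print(grid, aligned["B"])
```

On numpy arrays, `numpy.interp(grid, timestamps, values)` performs the same linear interpolation with the same edge behaviour by default, and would be the natural choice for large datasets.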
11
Pytimber Tools: BSRT
Class to process LHC BSRT data.
12
Outlook
- Pytimber has proved to be an easy way to query CALS data.
- The Pytimber API is simple and generic enough to be used by other data services.
- Applications are being built on top of the Pytimber API.
Actions and proposals:
- Formalize the Python API and use it for future products related to machine data.
- Use the pytimber package to provide additional tools that give a more complete experience, collect best practices, avoid reinventing wheels, and reduce development time.
- CO to be more involved in jpype maintenance.
- Profile pytimber and see whether opportunities for speed-up emerge.
- Start using py.test for testing and improve documentation (Sphinx?)
13
Proposals for sub packages