Instrumenting CernVM-FS Code
José Molina Colmenero, CERN PH-SFT, June 29th 2015

Main tasks so far
Improve test coverage
– Discover bugs and make sure everything works fine
– Presentation of the results
Benchmarking
– To know the consequences of our changes
– Presentation of the results
Automatic integration: Jenkins

Test coverage: current situation
Mainly focused on developing new unit tests
– C++: currently 626 tests from 59 test cases
– Python: 24 tests from 6 test cases (85% covered)
– Integration tests and stress tests additionally check how the entire system works
Main target: reach 75% test coverage by the end of 2015
Measurement performed using GCC Code Coverage Report, i.e. the Gcov tool
– Data presented using an external tool: gcovr
– Results can be found here
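
The slides do not show the commands used; below is a minimal sketch of how such reports could be produced with gcovr, assuming the sources were compiled with GCC's --coverage flag and the tests have already run (paths are illustrative):

    # Minimal sketch: produce the HTML and Cobertura-XML coverage reports.
    # Assumes a tree built with "gcc --coverage" whose tests have already run.
    import subprocess

    SRC_ROOT = "."        # repository root (illustrative)
    BUILD_DIR = "build"   # directory holding the .gcno/.gcda files (illustrative)

    # HTML report with per-line details, suitable for uploading to a web server
    subprocess.check_call(["gcovr", "-r", SRC_ROOT, BUILD_DIR,
                           "--html", "--html-details", "-o", "coverage.html"])

    # Cobertura-style XML report, consumable by the Jenkins Cobertura plug-in
    subprocess.check_call(["gcovr", "-r", SRC_ROOT, BUILD_DIR,
                           "--xml", "-o", "coverage.xml"])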

Test coverage: Gcov (screenshot)

Test coverage: Gcov in detail (screenshot)

Test coverage: Gcov global results
Current status: if we include all CernVM-FS sources, the total coverage is even lower
– We expect to find around … lines, and we have currently tested around 43%
– When I arrived, the coverage was around 25%

Test coverage: Jenkins
Test coverage reporting is already integrated in Jenkins
– It allows us to keep the test coverage automatically updated
Jenkins project: CvmfsTestCoverage

Test coverage: Jenkins
It would be necessary to install a new plug-in to publish the test coverage XML report
– The Cobertura plug-in
– It would offer a visualization of the test coverage evolution
HTML results with per-line details will be uploaded to a server
– Check them out here

Benchmarking: what for?
It is important to know not only that a change “apparently” breaks nothing, but also that performance remains good
In CernVM-FS many components rely on one another, so changing one of them can have a significant impact on overall performance
– How do we measure that?

Benchmarking: initial status
CernVM-FS already has a statistics system that provides diverse counters
Graphical visualization can help a lot to spot trends over longer periods
The program must be executed with a realistic run so that those counters can be considered valid values

Benchmarking: infrastructure
Benchmarks for four important experiment frameworks
– ATLAS, LHCb, ALICE and CMS at the moment
Many different run options (a hypothetical configuration sketch follows)
– Benchmarks to run, number of iterations, cold/warm cache, number of instances in parallel…
– As flexible as possible, in order to integrate new tools easily
Already integrated in Jenkins
– CvmfsBenchmark project
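
The slides only enumerate the options; purely to illustrate their shape, a hypothetical run configuration could look like this (all names invented):

    # Hypothetical benchmark run configuration; the real infrastructure's
    # option names are not shown in the slides.
    RUN_CONFIG = {
        "benchmarks": ["atlas", "lhcb", "alice", "cms"],  # frameworks to exercise
        "iterations": 10,          # repetitions per benchmark
        "warm_cache": False,       # False = wipe the client cache first (cold run)
        "parallel_instances": 4,   # concurrent client instances
    }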

Benchmarking: analysis tools
First try: Igprof
– We could not make it work: problems with forks, and it was necessary to modify the source code
– CernVM-FS uses many signals and features that interfere with Igprof
– No meaningful results for non-CPU-intensive software such as CernVM-FS, due to long idle times
Second try: Valgrind (see the sketch below)
– Two tools at the moment: Callgrind and Memcheck
– Necessary to add new start-up options to avoid forks: the simple_option_parsing option
Other tools could easily be added if necessary
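
A minimal sketch of how the client might be run under both Valgrind tools; the cvmfs2 invocation itself is illustrative (only the simple_option_parsing option is taken from the slide), while the Valgrind flags are standard:

    # Minimal sketch: run the CernVM-FS client under Valgrind's two tools.
    import subprocess

    # Illustrative client invocation only; the real mount arguments differ.
    # simple_option_parsing is the start-up option mentioned above to avoid forks.
    client_cmd = ["cvmfs2", "-o", "simple_option_parsing",
                  "atlas.cern.ch", "/cvmfs/atlas.cern.ch"]

    # Callgrind: call graph and cost per function, dumped for kcachegrind
    subprocess.check_call(["valgrind", "--tool=callgrind",
                           "--callgrind-out-file=callgrind.out"] + client_cmd)

    # Memcheck: leak search, with XML output for later processing
    subprocess.check_call(["valgrind", "--tool=memcheck", "--leak-check=full",
                           "--xml=yes", "--xml-file=memcheck.xml"] + client_cmd)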

Benchmarking: Callgrind
Callgrind is a Valgrind tool that shows the call graph and how much time is spent in each function
– Good for determining which functions should be improved to get better performance
It fully emulates the code, so the results are exact
– But it runs 20–30 times slower
– That is not a problem for CernVM-FS, because it is not CPU-intensive
Tool for graphical visualization: kcachegrind

Benchmarking: Callgrind output (screenshot)

Benchmarking: Memcheck
Memcheck is a Valgrind tool that searches for memory problems
– Very helpful for identifying potential memory bugs that are often hard to detect
– But also for improving memory usage by finding memory leaks
– However, it does not yet have a friendly graphical visualization; it is possible to give memcheckview a try
Integration with Jenkins
– A Valgrind plug-in could be installed
(A sketch of summarizing Memcheck's XML output follows)
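
To illustrate what can be done with the XML output produced above, a minimal sketch that counts the reported error kinds (Memcheck emits one <error> element, with a <kind> child, per problem):

    # Minimal sketch: summarize a Memcheck XML report by error kind.
    import xml.etree.ElementTree as ET
    from collections import Counter

    tree = ET.parse("memcheck.xml")  # file produced by --xml-file above
    kinds = Counter(error.findtext("kind")
                    for error in tree.getroot().iter("error"))

    for kind, count in kinds.most_common():
        print("%-25s %d" % (kind, count))  # e.g. "Leak_DefinitelyLost  3"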

Benchmarking: Valgrind tools
A considerable problem is that the visualization tools are mostly desktop programs
– Inconvenient for visualizing the results
– However, right now it is the only practical way
An online tool would come in handy
– There are no good ones at the moment
– We tried Webgrind, but it is an old project that does not meet all our requirements and has security holes

Benchmarking: statistics
CernVM-FS generates a statistics file that includes information about all the counters spread through the code
– Externally observed numbers are added by the benchmarking infrastructure: currently time spent and memory consumption
A Python script parses this file and sends the data to a special database (see the sketch below)
– A distributed time-series database: InfluxDB
– SQL-like syntax for queries, but a totally different approach for insertions
Another third-party program takes care of the graphical representation
– Grafana is a highly configurable program used to show the evolution of our counters online
– You can find our instance here
– New data is automatically sent every day through our Jenkins project
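
The script itself is only linked from the slides; a minimal sketch of the idea, assuming the influxdb Python client, a made-up "name|value" statistics file layout, and invented measurement and tag names:

    # Minimal sketch: parse the statistics file and push counters to InfluxDB.
    from influxdb import InfluxDBClient

    counters = {}
    with open("cvmfs_statistics.txt") as stats:     # hypothetical file name
        for line in stats:
            name, sep, value = line.partition("|")  # assumed "name|value" layout
            if sep:
                counters[name.strip()] = float(value)

    client = InfluxDBClient("localhost", 8086, database="cvmfs_benchmarks")
    client.write_points([{
        "measurement": "benchmark_counters",        # invented measurement name
        "tags": {"framework": "atlas", "cache": "cold"},
        "fields": counters,
    }])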

Benchmarking: InfluxDB (screenshot)

Benchmarking: Grafana (screenshot)

Benchmarking: Grafana (screenshot)

Benchmarking: configuring Grafana (screenshot)

Benchmarking: branch comparison
Situation: someone submits a pull request with a new feature
– It compiles, passes the unit tests, and looks good
– But does it have a considerable impact on the general performance?
It is complicated to run the benchmarks on both branches
– It requires installing the binaries, which is generally not a good idea
– A clean environment is necessary for every iteration

Benchmarking: Docker
Thanks to Docker, it is possible to run the benchmarks on each branch using exactly the same environment
– And the environment cleans itself up
We developed a Python script that takes care of initializing the instances and running each benchmark on the given branch (see the sketch below)
– Highly configurable: the external repository and branch, and essentially the same benchmark parameters, can be specified
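
The actual script is only linked from the slides; a minimal sketch of the approach, with an invented image name, repository URL and in-container entry point:

    # Minimal sketch: benchmark two branches in identical, throw-away containers.
    import subprocess

    IMAGE = "cvmfs/benchmark"                        # hypothetical image name
    REPO = "https://github.com/cvmfs/cvmfs.git"      # external repository

    def run_benchmark(branch):
        # --rm removes the container afterwards, so every branch starts
        # from exactly the same clean environment
        subprocess.check_call([
            "docker", "run", "--rm",
            "-e", "CVMFS_REPO=" + REPO,
            "-e", "CVMFS_BRANCH=" + branch,
            IMAGE, "/benchmark/run.sh",              # assumed entry point
        ])

    for branch in ("devel", "feature-branch"):
        run_benchmark(branch)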

Benchmarking: Docker (diagram; extracted from …)

Benchmarking: comparison result
The script generates a CSV file with the result (screenshot; a parsing sketch follows)
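
Since the CSV layout is only shown as a screenshot, this sketch of comparing the two branches' numbers uses invented column names:

    # Minimal sketch: compute the relative difference per counter from the CSV.
    import csv

    with open("comparison.csv") as result:
        for row in csv.DictReader(result):
            base = float(row["devel"])      # invented column names
            new = float(row["branch"])
            delta = 100.0 * (new - base) / base if base else 0.0
            print("%-30s %+.1f%%" % (row["counter"], delta))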

Benchmarking: comparison example
Recently we added sha2 functionality
– It is supposed to consume more memory, but what does “more” mean?
This case was tested using the previously described method
– Outcome: sha2 uses around 2-3% more memory than sha1 (metadata caches)
– We now clearly know the cost of this feature

Benchmarking: comparison integration
This feature is also integrated in Jenkins
– CvmfsBranchComparison project
– It provides a user-friendly interface to specify all the necessary parameters

Automatic Integration: Jenkins
So far we have already mentioned important automatic integration projects on Jenkins:
– All our benchmarks run once a day
– Branch comparison through benchmarking, triggered on demand
– Test coverage checking once a week

Automatic Integration: projects (screenshot)

Automatic Integration: documentation
Enabled the “make doc” functionality to generate Doxygen documentation in HTML format
– After generation, the files are uploaded to a web server
It is also fully integrated in Jenkins
You can find the latest documentation here

Automatic Integration: documentation (screenshot)

Automatic Integration: GHPRB
GitHub Pull Request Builder (GHPRB) is a Jenkins plug-in that automatically builds a recently pushed pull request and runs the unit tests on it
– Previously presented
We spotted a security hole
– It did not check GitHub’s secret to make sure the message was not sent by someone else

Automatic Integration: GHPRB
I wrote a pull request fixing the security hole, and it has recently been approved (see the sketch below)
– This means we can now safely use the plug-in
– To make it work, we would need to change the Jenkins access policy
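
The fix itself is in the plug-in's code; to illustrate the missing check, here is a minimal sketch of GitHub webhook verification (GitHub signs the payload with HMAC-SHA1 and sends "sha1=<hexdigest>" in the X-Hub-Signature header):

    # Minimal sketch: verify that a webhook payload really comes from GitHub.
    import hashlib
    import hmac

    def is_valid_signature(payload, header_signature, secret):
        # payload and secret are bytes; header_signature is "sha1=<hexdigest>"
        expected = "sha1=" + hmac.new(secret, payload, hashlib.sha1).hexdigest()
        return hmac.compare_digest(expected, header_signature)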

Automatic Integration: arm64 platform
Recently we successfully compiled CernVM-FS for arm64
– We could build it on a shared arm64 slave
– We would need a dedicated arm64 machine for testing and debugging in order to fully support it, as the tests could be potentially invasive

Lessons learnt so far
How to effectively use all these tools
– Testing allowed us to find a few bugs in the code
Testing methodology
– Use of mocking services for clearer and easier test development
Benchmarking is important because it:
– Allows us to quantify the cost of changes
– Spots long-term performance degradation
How to work on open source projects
– Especially using distributed version control systems
Automatic integration is important


Appendix: Jenkins wish list
Cobertura plug-in
– Publish test coverage results
GHPRB plug-in
– Automatically build and test pull requests
Remove the CERN log-in screen so we can receive POST notifications from GitHub