Enhancements to ROOT performance benchmarking

Slides:



Advertisements
Similar presentations
Britain Southwick Nicole Anguiano March 29, 2014
Advertisements

R2: An application-level kernel for record and replay Z. Guo, X. Wang, J. Tang, X. Liu, Z. Xu, M. Wu, M. F. Kaashoek, Z. Zhang, (MSR Asia, Tsinghua, MIT),
Intel® performance analyze tools Nikita Panov Idrisov Renat.
03/29/2004CSCI 315 Operating Systems Design1 Page Replacement Algorithms (Virtual Memory)
Microsoft ® Official Course Monitoring and Troubleshooting Custom SharePoint Solutions SharePoint Practice Microsoft SharePoint 2013.
Page 1 © 2001 Hewlett-Packard Company Tools for Measuring System and Application Performance Introduction GlancePlus Introduction Glance Motif Glance Character.
CPU PROFILING FIND THE BOTTLENECK. WHAT? WHEN? HOW?
Chocolate Bar! luqili. Milestone 3 Speed 11% of final mark 7%: path quality and speed –Some cleverness required for full marks –Implement some A* techniques.
Introduction to Open Source Performance Tool --Linux Tool Perf Yiqi Ju (Fred) Sep. 13, 2012.
SC 2012 © LLNL / JSC 1 HPCToolkit / Rice University Performance Analysis through callpath sampling  Designed for low overhead  Hot path analysis  Recovery.
Martin Schulz Center for Applied Scientific Computing Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory, P. O. Box 808, Livermore,
Embedded Software SKKU 14 1 Sungkyunkwan University Tizen v2.3 Application Profiling & Debugging.
® IBM Software Group © 2006 IBM Corporation PurifyPlus on Linux / Unix Vinay Kumar H S.
CHEP 2013, Amsterdam Reading ROOT files in a browser ROOT I/O IN JAVASCRIPT B. Bellenot, CERN, PH-SFT B. Linev, GSI, CS-EE.
ROOT I/O in JavaScript Browsing ROOT Files on the Web For more information see: For any questions please use following address:
Historical view for CMS job monitoring CERN, Summer student 2009 Student: PAVEL ŠIKTOROV Supervisor: JULIA ANDREEVA.
Software Engineering Prof. Dr. Bertrand Meyer March 2007 – June 2007 Chair of Software Engineering Lecture #20: Profiling NetBeans Profiler 6.0.
Profile, HAT, Wireless Toolkit’s Profile Sookmyung Women’s Univ. PSLAB Choi yoonjeong.
Beyond Application Profiling to System Aware Analysis Elena Laskavaia, QNX Bill Graham, QNX.
PINTOS: An Execution Phase Based Optimization and Simulation Tool) PINTOS: An Execution Phase Based Optimization and Simulation Tool) Wei Hsu, Jinpyo Kim,
 1- Definition  2- Helpdesk  3- Asset management  4- Analytics  5- Tools.
Web Design, 5 th Edition 6 Multimedia and Interactivity Elements.
TOPSpro Special Topics VI:TOPSpro for Instructors.
Java Flight Recorder and Java Mission Control
HedEx Lite Obtaining and Using Huawei Documentation Easily
SQL Database Management
Building Dashboards with JMP 13 Dan Schikore SAS, JMP
Marian Ivanov, Anar Manafov
Creating Visual Effects
Kai Li, Allen D. Malony, Sameer Shende, Robert Bell
Introduction to Operating Systems
Improving the support for ARM in IgProf
Process Management Process Concept Why only the global variables?
Software Architecture in Practice
MCTS Guide to Microsoft Windows 7
Introduction to Analysis of Algorithms
Introduction to Analysis of Algorithms
Valgrind, the anti-Alzheimer pill for your memory problems
Enhancing Web Map Performance in ArcGIS Online
TAU integration with Score-P
Tutorial 6 Topic: jQuery and jQuery Mobile Li Xu
Creating Visual Effects and Animation
Boost Linux Performance with Enhancements from Oracle
May 23-24, 2012 Microsoft.
Flight Recorder in OpenJDK
What we need to be able to count to tune programs
Capriccio – A Thread Model
Third Party Tools for SQL Server
Introduction to Operating Systems
An Empirical Study of Web Interface Design on Small Display Devices
DTC Troubleshooting and Support Webinar
QNX Technology Overview
PerfView Measure and Improve Your App’s Performance for Free
Module 2: Computer-System Structures
Adaptive Code Unloading for Resource-Constrained JVMs
Citation Map Visualizing citation data in the Web of Science
Best Practices for Migrating Custom Views in TrueSight Console
The Ins and Outs of Performance Profiling and Debugging
SEEM4570 Tutorial 5: jQuery + jQuery Mobile
Min Heap Update E.g. remove smallest item 1. Pop off top (smallest) 3
Performance And Scalability In Oracle9i And SQL Server 2000
Module 2: Computer-System Structures
SCORM Compliant Authoring Tool Developed at An-Najah University
Java Virtual Machine Profiling. Agenda Introduction JVM overview Performance concepts Monitoring Profiling VisualVM demo Tuning Conclusions.
TEE-Perf A Profiler for Trusted Execution Environments
What is a TEXT STRUCTURE?
Yale Digital Conference 2019
Power BI for the Consumer
Plugin development August 2019
Presentation transcript:

Enhancements to ROOT performance benchmarking Supervisors: O. Shadura, E. Guiraud L. Harutyunyan Harutyunyan American University of Armenia

Table of Content Rootbench: what and why? New benchmarks in rootbench.git Flamegraphs Improvements in RBSupport library Future Items

Rootbench: what and why? Continuous performance monitoring system for ROOT Rich and customizable visualizations and aids for performance analysis https://github.com/root-project/rootbench

Rootbench: what and why? Usually based on gbenchmark micro benchmarking infrastructure Micro-benchmarks focus on a single function or small piece of functionality Useful for monitoring performance of software hotspots https://github.com/google/benchmark

Rootbench: what and why? Technology Nightly Trigger rootbench.git Sign out Benchmark Visualization

Rootbench: what and why? In performance analysis of ROOT it is mostly interesting to see Wall clock time User CPU time Memory measurements (implemented only for interpreter benchmarks and for interpreted PyROOT ) Stack traces (new feature) Should I keep this slide?

New 26 benchmarks in rootbench.git

RDataFrame https://indico.cern.ch/event/587955/contributions/2937534/

New 14 RDataFrame benchmarks BM_RDataFrameSum_CreateDFFromTFileAndBookSum BM_RDataFrameSum_SumScalarDF BM_RDataFrameSum_SumScalarWithForeach BM_RDataFrameSum_SumScalarAfterXDefines BM_RDataFrameSum_SumVectorAfterXDefines ..and others RDF::Sum()

New 14 RDataFrame Benchmarks Should I talk about RDataFrame?

New TBranch & TTreeReader benchmarks BM_TTreePlayer_FixedSizeArrayTBranch BM_TTreePlayer_VarSizeArrayTBranch BM_TTreePlayer_StdVectorArrayTBranch BM_TTreePlayer_FixedSizeArrayReaderArray BM_TTreePlayer_VarSizeArrayReaderArray BM_TTreePlayer_StdVectorReaderArray ..and others TTree:: GetEntry(Long64_t event..) vs TTreePlayer::Next() We benchmarked two different types of ROOT Tree event loops readers! 1.https://root.cern.ch/doc/master/classTTree.html#a9fc48df5560fce1a2d63ecd1ac5b40cb 2.https://root.cern.ch/doc/master/classTTreeReader.html#af7b3aa2ea7b5b9a54b3aed57bab4d0d7

New TBranch:Sum() & TTreeReader::Sum() benchmarks BM_TTree_SumScalarTBranchGetEntry BM_TTree_SumScalarTTreeGetEntry BM_TTree_SumScalarTTreeReader BM_TTree_SumVectorTBranchGetEntry BM_TTree_SumVectorTTreeReader ..and others RDataFrame::Sum() vs Summation using TTreeReader vs Summation using TBranch::GetEntry() We benchmarked two different ways to do the summation operation both for RDF and ROOT TTree/TreePlayer event loop readers!

https://rootbnch-grafana-test. cern https://rootbnch-grafana-test.cern.ch/d/m1232LIZk/root-ttreeplayer-benchmarks?orgId=1

Performance analysis: Flame Graphs http://www.brendangregg.com/flamegraphs.html

Flame Graphs: what and why? Interactive visualizations of profiled software. Most frequent code-paths are identified quickly and accurately. Different types of flame graphs: CPU Memory Off-CPU Hot/Cold Differential

Flame Graphs: what and why? All data in one picture Interactive using JavaScript and browser Each box represents a function Box width is proportional to the total time a function was profiled directly or its children were profiled

Flame Graphs: Generation Profile event of interest (perf, eBPF, SystemTap, Instruments, DTrace etc.) Stackcollapse.pl Converts profile data into a single line record. Full output is many lines, one line per stack. Grep can be used to filter stacks before FlameGraphs

Flame Graphs: Generation 3. flamegraph.pl Converts folded stacks into an interactive SVG

FlameGraphs: CPU Measure code paths that consume CPU Understand and optimize CPU usage, improving performance and scalability Performed by sampling CPU stack traces at a timed interval

Sample population, sorted alphabetically (not time) Top edge shows who is on-CPU directly Stack Depth One Stack Sample Sample population, sorted alphabetically (not time)

Overhead Overhead

FlameGraphs: Memory Memory FlameGraphs are helpful when analyzing memory growth or leaks by tracing one of the following memory events: Allocator functions: malloc(), free() brk() syscall mmap() syscall Page faults

How to display generated FlameGraphs Generate them locally https://github.com/HLilit/rootbench/tree/flamegr aph-lilit/rootbench-scripts Download automatically generated flamegraphs directly from page Grafana SVG Panel (simple to store/display, work in progress to upload automatically) TTreeSum_SumScalarTBranchGetEntry, TTreeSum_SumScalarTTreeGetEntry, TTreeSum_SumScalarTTreeReader

Improvements in RBSupport library Added memory measurements for interpreted PyROOT and improved the memory measurements for ROOT interpreter. https://github.com/root-project/rootbench/pull/95

Future Items Finish porting iotools.git (IO benchmarks from ACAT 2017 presented by Jakob Blomer) into rootbench.git Finalize PRs about flamegraphs and improved memory measurements in rootbench.git                       https://github.com/root-project/rootbench/pull/94 https://github.com/root-project/rootbench/pull/95 Summer Student project final report - https://cds.cern.ch/record/2684053?ln=en

Thank you!