BESIII physics offline data analysis on a virtualization platform
Qiulan Huang, Computing Center, IHEP, CAS
CHEP 2015

Outline
- Overview of HEP computing at IHEP
- What is a virtualized computing cluster?
- Why a virtualized computing cluster?
- What we have done
- Scheduling BESIII jobs to the virtual computing cluster
- Current status
- Conclusion

HEP computing at IHEP
Supports several experiments:
- BEPCII and BESIII
- Cosmic ray / astrophysics in Tibet
- Daya Bay
- CMS and ATLAS experiments at the LHC
- Accelerator-Driven Subcritical System (ADS)
- China Spallation Neutron Source (CSNS)
- Future experiments:
  - Jiangmen Underground Neutrino Observatory (JUNO): ~500 TB x 10 years
  - LHAASO: 2 PB per year after 2017, accumulating more than 20 PB in 10 years

Computing status at IHEP
- ~ CPU cores in about 60 queues, managed by Torque
- Problems:
  - Low resource utilization
  - Poor resource sharing

What is a virtualized computing cluster?
- KVM + OpenStack + Torque/Maui: OpenStack is integrated with Torque/Maui to provide a computing service on top of IHEPCloud
- The virtual cluster and the physical cluster work together
- When a job queue is busy, its jobs can be allocated to a virtual queue
- VMs are created according to application requirements (a toy sketch of the routing rule follows)
[Architecture diagram: the distributed computing user interface submits jobs to the resource manager (CloudScheduler), which queries the load, schedules jobs to the virtual queue and drives a cloud API to create/stop VMs on IHEPCloud (the virtual queue); results are sent back; monitoring and unified deployment with Puppet support the platform]
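To make the busy-queue rule concrete, here is a minimal sketch in Python; the "_vq" naming convention for virtual queues and the 90% busy threshold are illustrative assumptions, not details from the talk.

  BUSY_THRESHOLD = 0.9  # fraction of occupied slots above which a queue is "busy"

  def pick_queue(queue, used_slots, total_slots):
      """Return the queue a new job should be submitted to: the physical
      queue while it has room, its virtual twin on IHEPCloud otherwise."""
      if total_slots == 0 or used_slots / total_slots >= BUSY_THRESHOLD:
          return queue + "_vq"  # assumed name of the matching virtual queue
      return queue

  print(pick_queue("besq", used_slots=95, total_slots=100))  # -> besq_vq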

Why?
Advantages:
- Improved resource utilization
- More efficient resource scheduling
- Simplified management
- Elasticity
- Resource heterogeneity is transparent to applications and users
- Energy saving
Problems to solve:
- The overall utilization of the computing resources is relatively low:
  - The IHEP computing cluster supports various experiments such as BES, Daya Bay and YBJ
  - Computing resources are partitioned per experiment and cannot be shared
  - At any given time some partitions are busy while others are idle, so jobs spend a long time queuing

What we have done

BESIII offline software optimization
- BESIII analysis is I/O heavy
- Creating event metadata and pre-selecting events by their metadata properties reduces the I/O throughput significantly (a sketch of the idea follows)
Details: see Xiaofeng's talk in track 2, 12:00 on 16/4 ("BESIII Physics Data Storing and Processing on HBase and MapReduce")
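As an illustration of the pre-selection idea, the sketch below filters a plain JSON-lines metadata index; the field names (ntracks, file, offset) and the index format are invented for the example, while the production system described in the referenced talk is built on HBase and MapReduce.

  import json

  def preselect(index_path, min_ntracks):
      """Yield (file, offset) for events whose metadata passes the cut, so
      the analysis opens only the data it actually needs."""
      with open(index_path) as f:
          for line in f:
              meta = json.loads(line)  # one small metadata record per event
              if meta["ntracks"] >= min_ntracks:
                  yield meta["file"], meta["offset"]

  # Only the pre-selected events are then read from the bulky event files,
  # which is what cuts the I/O volume of an analysis pass.
  for path, offset in preselect("events.meta.jsonl", min_ntracks=2):
      print(path, offset)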

Optimized KVM performance
Benchmark tests with default settings:
- The CPU performance penalty of KVM is about 10%, and the I/O penalty is about 12%
- The network performance penalty is about 3%
CPU affinity:
- Each process is bound to a specific CPU and not allowed to be dispatched to the other ones
- Avoiding frequent process migration between processors improves the cache hit rate
Extended Page Tables (EPT):
- Disabled EPT: modprobe kvm-intel enable_ept=0
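A minimal sketch of applying CPU affinity to a KVM guest, assuming libvirt's virsh is available on the hypervisor; the domain name and the vCPU-to-core mapping are hypothetical.

  import subprocess

  DOMAIN = "vm-001"   # hypothetical libvirt domain name
  PIN = {0: 2, 1: 3}  # guest vCPU -> host core, illustrative mapping

  for vcpu, core in sorted(PIN.items()):
      # "virsh vcpupin <domain> <vcpu> <cpulist>" binds one vCPU to a host
      # core, so the guest's threads stop migrating between processors and
      # the cache hit rate improves.
      subprocess.check_call(["virsh", "vcpupin", DOMAIN, str(vcpu), str(core)])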

Optimized CPU performance Default Optimized Qiulan Huang/CC/IHEP CPU benchmark testing Specifications Intel(R) Xeon(R) CPU X5650(2.67GHz),8 CPU cores,24GB内存 OS:SLC release 5.5 (Boron) el5.cve ,64bit KVM-83 Tools HEP-SPEC06 Optimized CPU performance increased about 3%

BESIII jobs running in VMs (1)
BES simulation job (BOSS), event number = 1000; jobOptions excerpt:

  BesRndmGenSvc.RndmSeed = 483366;
  #include "$BESSIMROOT/share/G4Svc_BesSim.txt"
  #include "$CALIBSVCROOT/share/calibConfig_sim.txt"
  RealizationSvc.RunIdList = {-9989};
  #include "$ROOTIOROOT/share/jobOptions_Digi2Root.txt"
  DatabaseSvc.SqliteDbPath = "/panfs/panfs.ihep.ac.cn/home/data/dengzy/pacman_bak/database";
  RootCnvSvc.digiRootOutputFile = "/scratchfs/cc/shijy/rhopi-bws rtraw";
  MessageSvc.OutputLevel = 5;
  ApplicationMgr.EvtMax = 10000;

Test environment:
- VM: 2 cores, 2 GB memory
- Physical machine: 8 cores, 16 GB memory
Test results:
- The job ran in 1:45:05 in the VM versus 1:42:04 on the physical machine, a performance penalty of about 2.9%
- The penalty is encouraging; the job uses only about 22% of the CPU

BESIII jobs running in VMs (2)
BES analysis job (BOSS); jobOptions excerpt:

  ... = {
    "/besfs2/offline/data/663-1/jpsi/tmp2/120520/run_ _All_file006_SFO-1.dst",
    "/besfs2/offline/data/663-1/jpsi/tmp2/120520/run_ _All_file006_SFO-2.dst"
  };
  // Set output level threshold (2=DEBUG, 3=INFO, 4=WARNING, 5=ERROR, 6=FATAL)
  MessageSvc.OutputLevel = 6;
  // Number of events to be processed (default is 10)
  ApplicationMgr.EvtMax = 1E9;
  ApplicationMgr.HistogramPersistency = "ROOT";

Test specification:
- VM: 1 CPU core, 2 GB memory
- Physical machine: 8 CPU cores, 16 GB memory
Test results:
- The job ran in 8:04:47 in the VM versus 7:48:32 on the physical machine
- The VM takes 975 s longer, a performance penalty of about 3%

IHEPCloud
- Launched in May
- A private IaaS platform that provides a self-service cloud for users and for IHEP scientific computing
- Open to any user with an IHEP account (>1000 users, >70 active users)

CloudScheduler
- Integrates the virtual computing cluster into the traditional physical cluster to improve resource utilization
- Uses fine-grained resource allocation to schedule tasks, instead of allocating whole nodes
- Provides flexible allocation policies to provision VMs dynamically, taking into account the job types, the system load and the real-time cluster status
- Schedules jobs to IHEPCloud

Architecture of CloudScheduler
- PBS (VM queue): extends the original Torque PBS to support a VM queue
- VM central controller: the matchmaker between the various modules (poll, calculate, publish)
- VM job controller: provides the job query service and schedules the jobs running in the virtual queue (recording the jobs in a database)
- VM resource controller:
  - Policymaker: decides the VM allocation strategy
  - VM controller: starts or stops VMs
  - CloudAPI: a wrapper module around the OpenStack API with some extensions
- Job agent (deployed in the VMs): pulls jobs to run, returns the job exit status and transfers the output files (a sketch of the pull loop follows)
[Diagram: PBS (VM queue), central controller, VM resource controller, VM job controller, IHEPCloud, job agents]
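The job agent's pull loop might look like the following sketch, assuming a simple HTTP interface on the VM job controller; the endpoint URL, the JSON field names and the queue name are all invented for illustration.

  import subprocess
  import time

  import requests  # third-party HTTP client

  CONTROLLER = "http://vmjobctl.example.ihep.ac.cn"  # assumed controller endpoint
  QUEUE = "besq_vq"                                  # assumed virtual queue name

  while True:
      # Ask the VM job controller for a job matched to this VM's queue.
      reply = requests.get(CONTROLLER + "/pull", params={"queue": QUEUE}).json()
      job = reply.get("job")
      if job is None:
          time.sleep(30)  # nothing matched: poll again later
          continue
      # Run the job's command and capture its exit status.
      status = subprocess.call(job["command"], shell=True)
      # Report the exit status back; output files would be staged out here too.
      requests.post(CONTROLLER + "/report", json={"id": job["id"], "exit": status})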

Workflow
[Sequence diagram among PBS, the VM central controller, the VM resource controller, the VM job controller and the job agents. The messages include: requests and replies for the number of running jobs in each queue; checks of the VM queue and resource status; queue-name requests answered with running and queued job counts and VM IP addresses; requests for the VM type, total VM number and VM IP list; replies with the number of active VMs and the maximum job number per queue; start/stop VM commands; and scheduled jobs delivered to the job agents]

Push + pull mode
- Pull: allocate CPUs/cores to BESIII jobs with suitable resources
  - When a new job arrives, PBS asks the VM central controller for VM resources
  - The VM central controller prepares the corresponding resources for the job
  - The VMs then request matched jobs
- Push: inside the cluster the original "push" scheduling is kept, and jobs are scheduled into the virtualized queue
  - Transparent to users

VM allocation policy
- VMs are allocated dynamically: the number of VMs to provision is determined by the current status of the virtual cluster
- The key question is how to allocate VMs according to the cluster load, especially using the information from the monitoring system
- The VM allocation strategy interface is configurable: linear addition and subtraction, and so on (a sketch of such a linear policy follows)
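The sketch below is one plausible reading of a "linear addition and subtraction" strategy; the step sizes and the quota are made-up tuning parameters, not values from the talk.

  STEP_UP = 5    # VMs added per scheduling cycle while jobs are waiting
  STEP_DOWN = 2  # VMs removed per cycle while the virtual queue is idle

  def vm_delta(queued_jobs, idle_vms, active_vms, max_vms):
      """Return how many VMs to start (>0) or stop (<0) this cycle."""
      if queued_jobs > 0 and idle_vms == 0:
          # Jobs waiting and every VM busy: grow linearly, capped by the quota.
          return min(STEP_UP, max_vms - active_vms)
      if queued_jobs == 0 and idle_vms > 0:
          # Nothing queued and some VMs idle: shrink linearly.
          return -min(STEP_DOWN, idle_vms)
      return 0  # load and capacity are balanced

  # Example: 40 queued jobs, no idle VM, 20 of 100 quota used -> start 5 VMs.
  print(vm_delta(queued_jobs=40, idle_vms=0, active_vms=20, max_vms=100))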

Current status
- The testbed is complete and hundreds of test jobs have been submitted
- Some problems remain:
  - Messages exchanged between modules are sometimes lost
  - The job agent service goes offline when it cannot connect to the server side
Next steps:
- Fix bugs
- Implement more VM allocation strategies
- Apply the system to other experiments such as JUNO, Daya Bay and YBJ
- Provide an online service within the year

Summary
- Creating event metadata can reduce the I/O throughput significantly
- The CPU and network performance of KVM is encouraging and can meet the BESIII experiment's requirements
- The virtual computing cluster is a good supplement to the existing physical cluster

Any questions?