Track 3 summary Distributed Computing

Latchezar Betev for T3, 14 October 2016

Track 3 conveners and session chairs
- Conveners: Tanya Levshina (FNAL), Weidong Li (IHEP), Latchezar Betev (CERN)
- Session chairs: Andreas Peters, Barthelemy von Haller, Maarten Litmaath, Miguel Pedreira, Xavier Espinal Curull
- Many thanks to the student helpers for the efficient support!

T3 stats
- 55 abstracts: 38 oral, 17 posters
- As usual, very difficult to make the oral/poster decision
- Room occupancy almost full: 45-55 participants

General notes on T3
- What it was not: presentation of high-level, all-encompassing framework designs
- Common themes:
  - Adaptation and tuning of said 'high-level' frameworks to targeted local / opportunistic / someone else's (LHC@HOME) / commercial cloud resources, plus making the Grid work better
  - Introduction of more efficient and compact workflows
  - Inclusion of external information sources to boost performance

General notes (2)
- Goals: getting the remaining 10-15% of efficiency out of the existing Grid; aggressive inclusion of new resource types
- Looking into all corners to get what is needed for LHC and non-LHC computing…
- …in a progressively tighter resource market, as more computing is needed to get the job done
- Thanks a lot, LHC*! …and the upcoming upgrades …and other projects
- * a very welcome problem

Central and internal job brokering
- Matching to fit resource types, availability and limits: HLT farms, HPC/supercomputers, clouds, @HOME (and the Grid)
- Memory limits management and internal brokering for multicore processing
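To illustrate the kind of matching several brokering talks described, here is a minimal Python sketch of memory-aware slot matching for multicore payloads. The `Slot`/`Payload` classes and the tight-fit policy are assumptions for illustration, not any experiment's production broker.

```python
# Minimal sketch of memory-aware brokering for multicore payloads.
# All names are illustrative; real brokers (PanDA, DIRAC, AliEn, ...) are far richer.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Slot:
    site: str           # e.g. a Grid CE, an HLT farm partition, a cloud VM flavour
    cores: int
    memory_mb: int      # total memory available in the slot

@dataclass
class Payload:
    cores: int              # requested cores (1 for single-core, 8 for typical multicore)
    memory_per_core_mb: int

def match_payload(payload: Payload, slots: List[Slot]) -> Optional[Slot]:
    """Pick a slot that satisfies both the core count and the
    aggregate memory limit of a (possibly multicore) payload."""
    needed_mb = payload.cores * payload.memory_per_core_mb
    candidates = [s for s in slots
                  if s.cores >= payload.cores and s.memory_mb >= needed_mb]
    # Prefer the tightest fit so large slots stay free for large payloads.
    return min(candidates, key=lambda s: (s.cores, s.memory_mb), default=None)

if __name__ == "__main__":
    slots = [Slot("T2_singlecore", 1, 2000),
             Slot("HLT_farm", 8, 16000),
             Slot("HPC_node", 32, 64000)]
    print(match_payload(Payload(cores=8, memory_per_core_mb=2000), slots))
```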

Fabric management and global pools
- Web proxy discovery standardization
- 'Global pools' for all resources and tasks
- Network-aware brokering: treat the network as a resource and adjust the workload accordingly
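A hedged sketch of what "treat the network as a resource" can mean in practice: rank candidate sites by a combined cost of free slots and measured bandwidth to the input data. The metric names, the one-hour queue penalty and the weighting are invented for illustration, not a production algorithm.

```python
# Illustrative only: rank candidate sites for a workload by combining free job
# slots with measured network bandwidth to the input data. The weighting and
# the one-hour queue penalty are assumptions.

def site_score(free_slots: int, bandwidth_to_data_gbps: float,
               input_size_gb: float) -> float:
    """Lower is better: estimated transfer time plus a crude queueing penalty."""
    transfer_time_s = 8.0 * input_size_gb / max(bandwidth_to_data_gbps, 0.1)
    queue_penalty_s = 0.0 if free_slots > 0 else 3600.0
    return transfer_time_s + queue_penalty_s

sites = {
    "SITE_A": {"free_slots": 120, "bw_gbps": 10.0},   # hypothetical sites
    "SITE_B": {"free_slots": 0,   "bw_gbps": 40.0},
}
best = min(sites, key=lambda s: site_score(sites[s]["free_slots"],
                                           sites[s]["bw_gbps"],
                                           input_size_gb=500.0))
print("broker choice:", best)
```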

Non-LHC VO support models
- Medium and small collaborations: no knowledge of distributed computing, no manpower to develop their own
- Robust, common, modular set of tools; reuse the existing set as much as possible
- Example: the Cherenkov Telescope Array

User support
- Advanced tools for end-user analysis and small-site operation as a solution to many of the shortcomings of the large frameworks
- Adapting the existing tools to better support central analysis activities: output formats and job priorities

Fabric evolution in response to evolving workflows
- Analysis of fabric parameters, with feedback to the experiments to improve the upstream submission tools
- Coordinated data caches (between batch and data servers) to improve analysis efficiency
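To make the cache idea concrete, a minimal sketch of a coordinated LRU read cache sitting between batch nodes and the data servers; the class name, capacity and file paths are illustrative assumptions only.

```python
# Minimal sketch of a coordinated read cache between batch nodes and the data
# servers; the class name, capacity and file paths are invented for illustration.
from collections import OrderedDict

class CoordinatedCache:
    """Shared LRU index: jobs ask the cache first, falling back to the data
    server and registering the file so later analysis jobs get a cache hit."""

    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.used_gb = 0.0
        self.files = OrderedDict()   # path -> size_gb, kept in LRU order

    def open(self, path: str, size_gb: float) -> str:
        if path in self.files:
            self.files.move_to_end(path)          # refresh LRU position
            return f"cache hit: {path}"
        # Evict least-recently-used files until the new one fits.
        while self.used_gb + size_gb > self.capacity_gb and self.files:
            _evicted, evicted_size = self.files.popitem(last=False)
            self.used_gb -= evicted_size
        self.files[path] = size_gb
        self.used_gb += size_gb
        return f"cache miss, staged from data server: {path}"

cache = CoordinatedCache(capacity_gb=10.0)
print(cache.open("/store/data/run2016/AOD_001.root", 4.0))   # miss, staged
print(cache.open("/store/data/run2016/AOD_001.root", 4.0))   # hit
```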

Bye, Tiers!
- Tiers were not mentioned much, but their dissolution is almost complete
- Excellent site and network performance are to blame
- …and that's all I have to say about that

To Clouds!
- Clouds from a single site, from several geographically close centres, and clouds of clouds with heterogeneous resources (HPCs)
- Common theme: enabled for multiple user communities
- CernVM, containers, Mesos, Marathon, Calico, Docker, HTCondor: plenty of enabling technology
- New context-aware cloud scheduling
- Cloudy (bright) future
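As a flavour of the enabling technology, a small sketch of bursting containerised HTCondor workers into a common pool via `docker run`; the image name, the `CONDOR_HOST` entrypoint convention and the collector address are assumptions, not a deployment recipe taken from any of the talks.

```python
# Sketch of bursting containerised HTCondor workers into a common pool with
# `docker run`. The image name and the CONDOR_HOST entrypoint convention are
# assumptions about the worker image, not a recipe from the presentations.
import subprocess

def start_worker(pool_collector: str, cpus: int, memory_gb: int,
                 image: str = "example/htcondor-worker:latest") -> str:
    """Launch one detached worker container and return its container id."""
    cmd = [
        "docker", "run", "--detach",
        "--cpus", str(cpus),
        "--memory", f"{memory_gb}g",
        "-e", f"CONDOR_HOST={pool_collector}",   # assumed image convention
        image,
    ]
    return subprocess.run(cmd, check=True, capture_output=True,
                          text=True).stdout.strip()

# Example: burst out 10 single-core workers into a (hypothetical) pool.
# for _ in range(10):
#     start_worker("collector.example.org", cpus=1, memory_gb=2)
```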

Cost of (commercial) clouds
- From theory to practice: find the right bidding strategy for commercial clouds (AWS)
- Apply in practice: get many cores for a full simulation workflow
- On-premise resources are still cheaper, but the cloud is a viable strategy if properly approached; smart bidding is needed
- Commercial cloud prices are decreasing
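One way to read "smart bidding" in code: a hedged boto3 sketch that looks at the last day of spot price history and bids a margin above the recent peak, capped at the on-demand price. The margin, the look-back window and the instance type are illustrative assumptions, not the strategy the presenters used.

```python
# Hedged illustration of 'smart bidding': look at the last day of spot price
# history and bid a margin above the recent peak, capped at the on-demand
# price. Margin, look-back window and instance type are illustrative only.
import boto3
from datetime import datetime, timedelta, timezone

def suggest_bid(instance_type: str, on_demand_price: float,
                region: str = "us-east-1", margin: float = 1.2) -> float:
    ec2 = boto3.client("ec2", region_name=region)
    history = ec2.describe_spot_price_history(
        InstanceTypes=[instance_type],
        ProductDescriptions=["Linux/UNIX"],
        StartTime=datetime.now(timezone.utc) - timedelta(days=1),
    )["SpotPriceHistory"]
    recent_max = max(float(p["SpotPrice"]) for p in history)
    # Bid above the recent peak to reduce preemption, but never above on-demand.
    return min(recent_max * margin, on_demand_price)

# Example (prices are illustrative, not current):
# print(suggest_bid("c4.8xlarge", on_demand_price=1.59))
```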

More clouds
- Significant bursting capabilities, on par with large T1s (not really news)
- Acquiring the resources is becoming simpler and faster
- Several large-scale projects for commercial cloud procurement, to be used directly in production

Supercomputers and volunteering
- Supercomputers: sparse network (if any), may require specific software porting, missing software delivery mechanisms, non-x86 resources
- Access for non-LHC VOs through the same tools
- Challenges abound, but developers are optimistic
- Volunteer computing: getting resources 'for free'
- Very willing participants, great PR
- Up to 1-2% of total CPU resources, with steady growth

From the past: 'The End of the Day' by Harold Septimus Powers

…to today

Thanks to all presenters for the interesting and engaging talks and posters (and for largely sticking to the time limit). Many thanks to our LBNL and SLAC hosts for the great organization and support throughout the T3 sessions.