Computing Technology for Physics Research. Jerome Lauret, Axel Naumann.

Presentations
– 15 parallel contributions: 50% distributed computing, little online / control…
– Many well-designed posters
– 7 plenaries covering a wide range of innovative topics, and thus well attended; no need to repeat them here!
– 3 coffee breaks (and tea breaks)!
Axel Naumann - ACAT 2010 Track 1

Panels
– Multicore panel: Mohammad Al-Turany, Sverre Jarp, Alfio Lazzaro, Rama Malladi (Intel)
– Data management panel: Rene Brun, Andrew Hanushevsky, Tony Cass, Beob Kyun Kim, Alberto Pace
– Vivid discussion with the audience
– The issues brought up will define the future of our computing

PARALLEL PRESENTATIONS

Computing at Belle II
Thomas Kuhr:
– Belle II going Grid
– Demand peaks might move to the cloud
– Bookkeeping, metadata, … “nearest Grid site”

Distributed parallel processing analysis: Belle II, Hyper Suprime-Cam
Sogo Mineo: distributed parallel analysis framework ROOBASF
– ROOT-embedded, controls a modular-structured workflow
– With MPI, the workflow is program- or data-parallelized: not merely event-parallel, but also algorithm-parallel
– Quick development and quick analysis with boost.python: analysis code in Python, or in shared libraries called from Python
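The event-parallel scheme can be illustrated with a small sketch, using plain Python threads as a stand-in for the MPI machinery ROOBASF actually uses; `reconstruct` is a hypothetical per-event module, not code from the framework:

```python
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event):
    # stand-in for a per-event analysis module
    return sum(event)

def run_event_parallel(events, workers=4):
    # event-parallel: the same module is applied to many events at once;
    # algorithm-parallelism would instead split the modules themselves
    with ThreadPoolExecutor(workers) as pool:
        return list(pool.map(reconstruct, events))

print(run_event_parallel([[1, 2], [3, 4], [5, 6]]))  # -> [3, 7, 11]
```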

The ALICE Online Data Quality Monitoring
Barthélemy von Haller: AMORE (Automatic Monitoring Environment), based on ROOT
– Main repository for all monitoring data
– Multi-core: dedicated threads for time-consuming operations, plus parallelization at event level
– LHC restart: 35 monitoring agents publishing more than 3400 objects per second

Contextualization in Practice: The Clemson Experience
Jerome Lauret: the Grid has differing sites; idea: send the environment along with the job!
– Portable (contextualization), convenient?
– Virtual Organization Clusters (VOCs): read-only VO VM on shared storage, writes overlaid to /tmp; managed lifetime (queue, …)
– Promising test results, low overhead

EU-IndiaGrid2 - Sustainable e-infrastructures across Europe and India
– New high-bandwidth connection to India, connecting EU and India, also via Internet
– The bandwidth is there; how will it be used?

New WLCG services and ALICE’s AliEn Computing model*
Fabrizio Furano:
– Workload management system stable
– Large level of expertise with the gLite-VOBOX service; ALICE is its principal customer
– Expected to remove the gLite-WMS service from its production environment in the next months
– Computing elements: CREAM-CE setup at all sites among the highest priorities for 2010
– ALICE infrastructure ready for data, demonstrated during the first data taking in November 2009
* Title shortened!

Tools to use heterogeneous Grid schedulers and storage systems
Mattia Cinquilli: CMS uses EGEE, OSG and NorduGrid, batch systems, CAFs; sites choose their own computational and storage solutions: inhomogeneous
– Need a standard interface
– BossLite handles interaction with different middleware / batch systems, plus logging facilities
– SEAPI: higher-level API to storage protocols
– Results on job efficiency, number of jobs and users for CMS analysis applications

BNL Batch and DataCarousel systems
Jerome Lauret:
– Problem: data taken in time sequence, but access to tape is often stochastic; potential for chaos, random and excessive overhead
– Solution: ERADAT (Efficient Retrieval and Access to Data Archived on Tape, a file retrieval scheduler) and DataCarousel (policy-based resource sharing)
– Results: RHIC/STAR: 6 MB/s → 46.2 MB/s (LTO3); US-ATLAS ESD: 1 MB/s → 72 MB/s (LTO4); DataCarousel: only 1.21 mounts per tape
– Conclusions: tape access optimization is crucial; practical tools in use for RHIC and US-ATLAS production; code and knowledge shared with IN2P3
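The core scheduling idea can be sketched in a few lines: batch file requests per tape and read each tape in on-tape order, so every volume is mounted once and read sequentially (a hypothetical `schedule` helper, not the actual ERADAT code):

```python
from collections import defaultdict

def schedule(requests):
    """Order (tape, position, file) requests so each tape is mounted once
    and its files are read in on-tape order, avoiding random seeks."""
    by_tape = defaultdict(list)
    for tape, pos, name in requests:
        by_tape[tape].append((pos, name))
    # one entry per tape: the mount order and the sequential read list
    return [(tape, [name for _, name in sorted(by_tape[tape])])
            for tape in sorted(by_tape)]

plan = schedule([("T2", 5, "b"), ("T1", 9, "c"), ("T1", 1, "a")])
# -> [('T1', ['a', 'c']), ('T2', ['b'])]
```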

Building an Efficient Data Planner for Peta-scale Science
Michal Zerola:
– Functional planner, database, web interface
– Extensive studies of performance, simulation; installed in STAR, currently running tests
– Perspectives: multi-site transfers, similar benefits expected
– Controlled and efficient data movement: higher efficiency, coordination, load-balancing

Optimization of Grid Resources Utilization
Costin Grigoras: automation of storage operations by feeding monitoring information back into the decision-taking components of AliEn
– Flexible storage configuration: addressing the storage elements by QoS tag only; parallel storing of multiple replicas of the output
– Reliable and efficient file access: no more failed jobs, thanks to auto-discovery and failover in case of temporary problems; use working storages closest to the application
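A minimal sketch of the QoS-tag selection with failover, under an assumed toy data model (AliEn's real monitoring-driven logic is much richer):

```python
def pick_storage(elements, qos, tried=()):
    """Working storage elements matching a QoS tag, closest first."""
    ok = [e for e in elements
          if qos in e["qos"] and e["ok"] and e["name"] not in tried]
    return sorted(ok, key=lambda e: e["distance"])

def store_with_failover(elements, qos, write):
    """Try matching elements in distance order until one write succeeds,
    so a temporarily broken element does not fail the job."""
    tried = []
    for se in pick_storage(elements, qos, tried):
        if write(se["name"]):
            return se["name"]
        tried.append(se["name"])
    return None

elements = [
    {"name": "A", "qos": {"disk"}, "ok": True, "distance": 1},
    {"name": "B", "qos": {"disk"}, "ok": True, "distance": 2},
    {"name": "C", "qos": {"tape"}, "ok": True, "distance": 0},
]
# "A" is closest but its write fails, so the job falls over to "B"
chosen = store_with_failover(elements, "disk", lambda n: n == "B")  # -> "B"
```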

PROOF - Best Practices
Fons Rademakers: always fighting bottlenecks
– File layout: much improved in ROOT 5.26
– Merge: multiple mergers
– Disks, network: comparing storage solutions
– RAM
– CPUs: PROOF lite
Feeding 24 cores is not trivial
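The multiple-mergers idea (combining worker outputs in rounds rather than through one serial merger) can be sketched with plain dictionaries standing in as bin-to-count histograms; this is an illustration of the scheme, not PROOF code:

```python
def merge_pair(h1, h2):
    # combine two partial histograms (bin -> count maps)
    out = dict(h1)
    for b, c in h2.items():
        out[b] = out.get(b, 0) + c
    return out

def tree_merge(partials):
    """Merge worker outputs pairwise in rounds, as several intermediate
    mergers would, instead of one merger consuming all outputs serially."""
    while len(partials) > 1:
        nxt = [merge_pair(partials[i], partials[i + 1])
               for i in range(0, len(partials) - 1, 2)]
        if len(partials) % 2:      # odd one out passes to the next round
            nxt.append(partials[-1])
        partials = nxt
    return partials[0]

print(tree_merge([{1: 1}, {1: 2, 2: 1}, {2: 3}]))  # -> {1: 3, 2: 4}
```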

Optimizing CMS software to the CPU
– Settling on x86_64, using the move for a performance review with good tools
– Memory cost of 64-bit under control
– Allocation locality is an issue
– 64-bit math has surprises
– Multi-core: this year, fork for shared data; then “more fine-grained”; deployment?

"NoSQL" databases in CMS Data and Workflow Management
Andrew Melo:
– SQL schemas seen as too rigid
– Non-SQL DBs (CouchDB, Hadoop, …) suitable for selected applications
– Commonly use map/reduce for queries
– Successful implementation for logs etc.
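The map/reduce query style these stores share can be sketched as a toy in-memory view, mirroring the shape of CouchDB's map/reduce views but not its API; the log documents below are hypothetical:

```python
def map_reduce(docs, mapper, reducer):
    """Minimal map/reduce view: map emits (key, value) pairs per document,
    reduce folds the values collected under each key."""
    groups = {}
    for doc in docs:
        for key, value in mapper(doc):
            groups.setdefault(key, []).append(value)
    return {k: reducer(vs) for k, vs in groups.items()}

logs = [{"job": "a", "status": "done"},
        {"job": "b", "status": "failed"},
        {"job": "c", "status": "done"}]
# count job outcomes, CouchDB-view style: emit (status, 1), reduce with sum
counts = map_reduce(logs, lambda d: [(d["status"], 1)], sum)
print(counts)  # -> {'done': 2, 'failed': 1}
```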

Teaching a Compiler your Coding Rules
Axel Naumann:
– C++0x might require a new generation of tools
– Need to control C++0x features
– LLVM as a compiler toolset allows trivial implementations of some tools
– Example: a coding rule checker
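For a flavor of what a rule checker does: the talk's version works on the compiler's view of the code via LLVM, while this toy merely scans source lines with regexes; both rules shown are hypothetical examples:

```python
import re

# Hypothetical rules; a real checker would query the AST, not text.
RULES = [
    (re.compile(r"\busing\s+namespace\s+std\b"), "no 'using namespace std'"),
    (re.compile(r"\bgoto\b"), "no 'goto'"),
]

def check(source):
    """Return (line number, message) for every rule violation found."""
    violations = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for pattern, message in RULES:
            if pattern.search(line):
                violations.append((lineno, message))
    return violations

print(check("using namespace std;\nint x;\ngoto end;"))
# -> [(1, "no 'using namespace std'"), (3, "no 'goto'")]
```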

THEMES
Clouds, GridLite, Cores, Tools, Data

Clouds
From theory to practice to integration:
– Use cases still investigated
– How to justify local batch vs. paying cloud?
Why Clouds == outsourcing?
– Burst resource on peak demands?
– Opportunistic resources truly at reach?
Understand virtualization. Still open:
– Is it needed? Do we want it?
– Is it transparent enough for end-users?
– Integrity, reproducibility?
– Image management, networking?

GridLite
– Usability
– Management
– Fights are over, reality has arrived
– 1/3 of jobs lost: an inherent property of the grid? (Mattia Cinquilli)

Multi- & Other-Cores
– x86_64 understood, optimizing
– An occasion to optimize / re-think our code
– Finally competition across chip designs / architectures again! What will be the surviving architecture? Lots of dynamics…

Tools & Languages
Many tools, only some mastered. Do we need new non-expert tools?
– New compilers for parallelization
– New tools for detecting language constructs / best practices / code analysis
– Libraries / extensions helping novices to develop parallel applications (OpenCL)
Languages vs. optimizations / architectures: do we sell our souls? Standards!
C++0x not a topic yet, unlike 64-bit CPUs when they came out, Larrabee, …

Data
Equilibrium money - CPU - I/O: alchemy
– CPU focus swinging to I/O: did we overestimate the relevance of CPU (interpreter in the event loop)?
– Cache hierarchy: tape, disks, SSDs, RAM. All of the above? Throw a WAN in? Need algorithms for data lifetime across tiers
– Latency vs. cooperation / concurrency: theory!
– 24 cores (“wow!”) but no data coming through!

Conclusion
We’re done. NO: the race goes on.
Complexity is fought through generalization.
Handling and analyzing data will remain challenging in every respect.

Thanks!
– Thanks to the IAC for insight
– Sudhir for everlasting help
– All the chairs for creating a wonderful workshop atmosphere
– Students and others for flawless technical support
– Speakers for a mosaic of present and future, and their help with this summary
Let’s see what the future brings to Brunel!