Thoughts on Computing Upgrade Activities

Origins
These thoughts come primarily from the white papers contributed to the CWP activity by Maria Girone, Oliver Gutsche, Liz Sexton-Kennedy, and myself, plus some ideas that have been batted about. They include:
- General ideas on architectures
- Technology demonstrators we should be doing

Challenges
The challenge of the HL-LHC is about data: data complexity and data rate both increase. However, from a computing resource perspective some elements are more challenging than others:
- Storage is much closer to the expected technology increase than processing is: a 6-8 times increase is needed in storage, but potentially a 20 times increase in processing.
- This is because the processing time for complex events grows faster than their size, which tends to be linear with pile-up.
- The impact is that we need to work harder at how we process than simply at how we store.
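
A back-of-the-envelope sketch of that gap, assuming (only for illustration) a roughly ten-year horizon and a flat-budget technology gain of about 15% per year; only the 6-8x and ~20x factors come from the slide above:

    # Toy comparison of HL-LHC resource needs against assumed flat-budget
    # technology evolution. Growth rate and horizon are illustrative guesses.
    YEARS = 10                     # roughly now until HL-LHC
    GAIN_PER_YEAR = 1.15           # assumed yearly capacity gain at flat cost

    needed = {"storage": 7.0, "processing": 20.0}    # factors from the slide
    delivered = GAIN_PER_YEAR ** YEARS               # about 4x over the decade

    for resource, factor in needed.items():
        print(f"{resource}: need x{factor:.0f}, technology gives x{delivered:.1f}, "
              f"remaining gap x{factor / delivered:.1f}")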

Model
[Slide diagram: CERN and other dedicated resources at the core, drawing on clouds, university sites, supercomputers, grids, and other opportunistic computing (down to iPhone apps).]

Data Replication
The whole system relies on the ability to move data, and we have relied on GridFTP for data movement. TCP requires many streams to get reasonable transfer rates, which pushes toward a program with parallel capabilities like GridFTP.
Do we want to look at alternatives?
- UDP with checksums?
- Socket-based connections?
- Commercial offerings like Aspera?
A potential technology demonstrator would be to look at an alternative to GridFTP: firewall friendly, deployable in many environments, and capable of filling a 100 Gb/s link over a reasonable-latency connection.
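
A rough illustration of why many parallel streams (or a different protocol) are needed on such a link; the RTT and per-stream window below are assumed example values, not measurements:

    # Bandwidth-delay product: how many bytes must be in flight to fill the link,
    # and roughly how many TCP streams that implies at a given window size.
    link_gbps = 100.0              # target link capacity from the slide
    rtt_ms = 100.0                 # assumed "reasonable latency" round trip
    window_mb = 16.0               # assumed practical TCP window per stream

    bdp_bytes = (link_gbps * 1e9 / 8) * (rtt_ms / 1e3)
    streams = bdp_bytes / (window_mb * 1e6)

    print(f"bytes in flight needed: {bdp_bytes / 1e9:.2f} GB")
    print(f"parallel streams at {window_mb:.0f} MB windows: about {streams:.0f}")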

Resources: Storing Data
It is not obvious to me that, on the time scale of the HL-LHC, we should imagine data ever comes off tape except for disaster recovery.
- Oracle is getting out of the tape business, and there will be fewer manufacturers of tape drives.
- We probably still want a tape copy, but we probably don't care that all large sites have tape.
- We will still have tiered storage.
Demonstrator: use erasure-coded object storage as a cheap intermediate disk layer.
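
To make the "cheap intermediate layer" point concrete, a minimal sketch comparing the raw-storage overhead of an erasure-coded object layout with plain replication; the 10+4 layout is an assumed example, not a recommendation:

    # Raw bytes stored per byte of user data for a k+m erasure code,
    # compared with 3-way replication.
    def overhead(data_chunks: int, parity_chunks: int) -> float:
        return (data_chunks + parity_chunks) / data_chunks

    k, m = 10, 4   # example: 10 data chunks + 4 parity chunks
    print(f"{k}+{m} erasure code: {overhead(k, m):.2f}x raw storage, "
          f"survives loss of any {m} chunks")
    print(f"3 full replicas: {overhead(1, 2):.2f}x raw storage, "
          f"survives loss of any 2 copies")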

Distributing Data
Much of this model relies on making effective use of a variety of resources away from where the data is permanently stored and served. We think we need demonstrators on data management:
- Named Data Networking (NDN) combined with Software Defined Networking (SDN) to ensure that a group of files can arrive at a remote site for processing; integrate metadata and data relationships with the network layer. This is a potential project for the next phase of openlab.
- Improved data federation and smarter caching: demonstrate use of sites with no dedicated storage and only caching, evolve FAX and AAA in common, and develop smarter caches with similar elements to NDN.
[Slide diagram: a processing site with CPU and a local cache, backed by remote storage.]
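
A minimal sketch of the "site with no dedicated storage, only caching" idea: a read-through cache that serves hits from local scratch and fills misses from the data federation. The function fetch_remote(), the cache path, and the interface are placeholders, not real FAX/AAA APIs:

    import os

    CACHE_DIR = "/scratch/xcache"      # assumed local scratch area

    def fetch_remote(logical_name: str, destination: str) -> None:
        """Placeholder for a federation read over the WAN."""
        raise NotImplementedError("wire this to the real data federation")

    def open_cached(logical_name: str):
        local = os.path.join(CACHE_DIR, logical_name.lstrip("/"))
        if not os.path.exists(local):                  # cache miss
            os.makedirs(os.path.dirname(local), exist_ok=True)
            fetch_remote(logical_name, local)          # pull from remote storage
        return open(local, "rb")                       # serve from the cache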

Resources: Processing
A lot of the resources we rely on to close the gap will not be dedicated processing we bought:
- Cloud computing that was donated or that we paid for (engagements with commercial cloud providers have gone well)
- HPC shares
- Opportunistic resources
There is no reason to think these resources will or should come with a flat profile.
Technology demonstrator we would like to do with Condor:
- 10M processing jobs executed
- Condor executes "work", with "job" scheduling under the hood
- Provisioning for peak (see the sketch below)
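
A toy sketch of what provisioning for peak could mean: a dedicated baseline absorbs steady work, and elastic cloud/HPC/opportunistic slots are requested only for the excess. All numbers are invented for illustration:

    # Dedicated slots cover the baseline; the excess is provisioned elastically.
    DEDICATED_SLOTS = 50_000
    weekly_demand = [30_000, 45_000, 120_000, 90_000, 60_000, 40_000, 35_000]

    for day, demand in enumerate(weekly_demand):
        elastic = max(0, demand - DEDICATED_SLOTS)     # slots to rent or borrow
        print(f"day {day}: demand {demand:6d} -> "
              f"dedicated {min(demand, DEDICATED_SLOTS):6d}, elastic {elastic:6d}")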

Physics Data Reduction
Dealing with the data and the wide area gets a lot simpler if there is less data.
- Use data analytics tools to support data selection and reduction.
- Goal is to use map/reduce-type selections to take PBs and make them TBs for more detailed offline analysis.
- On-going demonstrator with Intel through openlab.
[Slide diagram: dedicated resources, clouds, and grids.]
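
A minimal sketch of a map/reduce-style skim, assuming placeholder functions read_events() and passes_selection() standing in for the experiment-specific pieces:

    from multiprocessing import Pool

    def read_events(path):
        """Placeholder: yield event records from one input file."""
        raise NotImplementedError

    def passes_selection(event) -> bool:
        """Placeholder: the selection that shrinks PBs towards TBs."""
        raise NotImplementedError

    def skim_file(path):
        # map step: keep only the events the analysis actually needs
        return [ev for ev in read_events(path) if passes_selection(ev)]

    def skim(paths):
        with Pool() as pool:
            per_file = pool.map(skim_file, paths)              # map over files
        return [ev for chunk in per_file for ev in chunk]      # reduce: concatenate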

Data Transformation
The simple cut-based data reduction might evolve into more advanced data refresh: accessing a data quantity could trigger recalculating it.
- See data reconstruction not necessarily as a big event loop, but as a series of discrete data transformations (a sketch follows below).
- Look at new programming languages and paradigms.
- A new project with the EU called DEEP-EST, using HPC resources, would be an interesting place to test some of these ideas.
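
A minimal sketch of "accessing a quantity triggers recalculating it": derived quantities are computed lazily on first access rather than inside one big event loop. The quantity names and the clustering/jet-building steps are invented placeholders:

    from functools import cached_property

    class Event:
        def __init__(self, raw_hits):
            self.raw_hits = raw_hits

        @cached_property
        def clusters(self):
            # first access triggers the transformation raw_hits -> clusters
            return cluster_hits(self.raw_hits)

        @cached_property
        def jets(self):
            # this transformation only runs if somebody asks for jets
            return build_jets(self.clusters)

    def cluster_hits(hits):
        raise NotImplementedError("placeholder clustering step")

    def build_jets(clusters):
        raise NotImplementedError("placeholder jet-building step")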

Demonstrators
- Data transfer
- Use of object store for the layer between disk and tape
- Named Data Networks
- Advanced caching
- Scalable workflow scheduling
- Data reduction
- Data transformation

Outlook
I think we have some techniques to look at, and the basics of a computing model general enough to support almost anything. I don't think we know how much any of the technologies we will investigate will actually go toward closing the resource gap.