High Energy Physics – A big data use case
Bob Jones, Head of openlab, IT Department, CERN


High Energy Physics – A big data use case
Bob Jones, Head of openlab, IT Department, CERN
Franco-British Workshop on Big Data in Science, London, 6-7 November 2012

This document produced by Members of the Helix Nebula consortium is licensed under a Creative Commons Attribution 3.0 Unported License. Permissions beyond the scope of this license may be available at http://helix-nebula.eu/. The Helix Nebula project is co-funded by the European Community Seventh Framework Programme (FP7) under a Grant Agreement.

Accelerating Science and Innovation

[Figure: data flow to permanent storage – 4-6 GB/s overall, with individual experiment streams of 1.25 GB/s and 1-2 GB/s]
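To put rates of a few GB/s in perspective, a quick back-of-the-envelope calculation shows how they accumulate into petabytes. The figures below are illustrative; the 10^7-second "accelerator year" of live data taking is a common HEP rule of thumb assumed here, not a number from the slide:

```python
# Rough scale of LHC data rates: a sustained GB/s stream -> volume per day / per running year.
GB = 10**9   # bytes (decimal units, as on the slide)
PB = 10**15

def volume_pb(rate_gb_per_s: float, seconds: float) -> float:
    """Total volume in petabytes for a sustained rate over a period."""
    return rate_gb_per_s * GB * seconds / PB

day = 86_400          # seconds in one day
run_year = 1.0e7      # seconds of live data taking in a typical "accelerator year" (assumption)

print(f"6 GB/s for one day:        {volume_pb(6, day):.2f} PB")
print(f"6 GB/s for a running year: {volume_pb(6, run_year):.0f} PB")
```

At 6 GB/s the experiments would write roughly half a petabyte per day, which is consistent with the tens of petabytes per year quoted later in the talk once duty cycle and lower average rates are taken into account.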

WLCG – what and why?
A distributed computing infrastructure to provide the production and analysis environments for the LHC experiments:
- Managed and operated by a worldwide collaboration between the experiments and the participating computer centres
- The resources are distributed – for funding and sociological reasons
- Our task was to make use of the resources available to us – no matter where they are located
- Secure access via X.509 certificates issued by a network of national authorities – the International Grid Trust Federation (IGTF)

Tier-0 (CERN): data recording, initial data reconstruction, data distribution
Tier-1 (11 centres): permanent storage, re-processing, analysis
Tier-2 (~130 centres): simulation, end-user analysis
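The tiered division of labour can be sketched as a small lookup table. This is purely an illustrative model of the slide's description – the names and the task-to-tier mapping are not a real WLCG service or API:

```python
# Illustrative model of the WLCG tier roles described above (not a real WLCG interface).
WLCG_TIERS = {
    "Tier-0": {"sites": 1,   "roles": ["data recording", "initial reconstruction", "data distribution"]},
    "Tier-1": {"sites": 11,  "roles": ["permanent storage", "re-processing", "analysis"]},
    "Tier-2": {"sites": 130, "roles": ["simulation", "end-user analysis"]},  # ~130 centres
}

def tiers_for(task: str) -> list[str]:
    """Return which tier(s) take on a given task in this simplified model."""
    return [tier for tier, info in WLCG_TIERS.items() if task in info["roles"]]

print(tiers_for("simulation"))   # ['Tier-2']
print(tiers_for("analysis"))     # ['Tier-1']
```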

WLCG: Data Taking
Castor service at Tier-0 is well adapted to the load:
- Heavy ions: more than 6 GB/s to tape (tests show that Castor can easily support >12 GB/s); the actual limit is now the network from the experiment to the computer centre
- Major improvements in tape efficiency – tape writing at ~native drive speeds, so fewer drives are needed
- ALICE used x3 compression for raw data in the heavy-ion runs
Heavy-ion runs: ALICE data into Castor at >4 GB/s; overall rates to tape >6 GB/s. 23 PB of data written in 2011, and more again in 2012!

Overall use of WLCG
- 10^9 HEPSPEC-hours/month (~150k CPUs in continuous use)
- 1.5M jobs/day
Usage (jobs/day and CPU usage) continues to grow, even over the end-of-year technical stop.
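These headline numbers are mutually consistent, which a quick check confirms. Two assumptions are made here that are not on the slide: ~730 hours per month, and a rough rule of thumb of ~10 HEP-SPEC06 per CPU core for 2012-era hardware:

```python
# Sanity-check the WLCG usage figures quoted above.
hepspec_hours_per_month = 1e9
hours_per_month = 730        # ~365 * 24 / 12
hepspec_per_core = 10        # rough 2012-era HEP-SPEC06 per core (assumption)

# 1e9 HEPSPEC-hours/month works out to order 140k cores busy around the clock,
# close to the ~150k quoted on the slide.
continuous_cores = hepspec_hours_per_month / hours_per_month / hepspec_per_core
print(f"~{continuous_cores:,.0f} cores in continuous use")

# With 1.5M jobs/day, the implied average job is a few core-hours long.
jobs_per_day = 1.5e6
avg_job_core_hours = continuous_cores * 24 / jobs_per_day
print(f"average job length ~{avg_job_core_hours:.1f} core-hours")
```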

[Figure: significant use of Tier-2s for analysis CPU]

Broader Impact of the LHC Computing Grid
WLCG has been leveraged on both sides of the Atlantic, to benefit the wider scientific community:
- Europe (EC FP7): Enabling Grids for E-sciencE (EGEE) → European Grid Infrastructure (EGI)
- USA (NSF): Open Science Grid (OSG) (+ extension?)
Many scientific applications: archaeology, astronomy, astrophysics, civil protection, computational chemistry, earth sciences, finance, fusion, geophysics, high energy physics, life sciences, multimedia, material sciences, …

EGEE – What do we deliver?
Infrastructure operation:
- Sites distributed across many countries
- Large quantity of CPUs and storage
- Continuous monitoring of grid services & automated site configuration/management
- Support for multiple Virtual Organisations from diverse research disciplines
Middleware:
- Production-quality middleware distributed under a business-friendly open source licence
- Implements a service-oriented architecture that virtualises resources
- Adheres to recommendations on web service inter-operability and is evolving towards emerging standards
User support – a managed process from first contact through to production usage:
- Training
- Expertise in grid-enabling applications
- Online helpdesk
- Networking events (User Forum, conferences, etc.)

Sample of Business Applications
SMEs:
- NICE (Italy) & GridWisetech (Poland): develop services on open source middleware for deployment on customers' in-house IT infrastructure
- OpenPlast project (France): develop and deploy a Grid platform for the plastics industry
- Imense Ltd (UK): ported its application to gLite and GridPP sites
Energy:
- TOTAL (UK): ported an application using the GILDA testbed
- CGGVeritas (France): manages in-house IT infrastructures and sells services to the petrochemical industry
Automotive:
- DataMat (Italy): provides grid services to the automotive industry

CERN openlab in a nutshell
A science–industry partnership to drive R&D and innovation, with over a decade of success:
- Evaluate state-of-the-art technologies in a challenging environment and improve them
- Test in a research environment today what will be used in many business sectors tomorrow
- Train the next generation of engineers/employees
- Disseminate results and reach out to new audiences
Bob Jones – CERN openlab, 2012

Virtuous Cycle
A public-private partnership between the research community and industry:
CERN requirements push the limit → apply new techniques and technologies → joint development in rapid cycles → test prototypes in the CERN environment → produce advanced products and services

openlab III
- Inter-partner collaborations: 2
- Fellows: 4
- Summer students: 6
- Publications: 37
- Presentations: 41
- Reference activities: over 15
- Product enhancements: on 8 product lines
CERN openlab Board of Sponsors 2012

ICE-DIP
Marie Curie proposal submitted in January 2012 to the EC and accepted for funding (total 1.25M€ from the EC).
ICE-DIP, the Intel-CERN European Doctorate Industrial Program, is an EID scheme hosted by CERN and Intel Labs Europe. ICE-DIP will engage 5 Early Stage Researchers (ESRs). Each ESR will be hired by CERN for 3 years and will spend 50% of their time at Intel. Academic rigour and training quality are ensured by the associate partners, National University of Ireland Maynooth and Dublin City University, where the ESRs will be enrolled in doctorate programmes.
Research themes: use of many-core processors for data acquisition, future optical interconnect technologies, reconfigurable logic, and data acquisition networks. The focus is the LHC experiments' trigger and data acquisition systems.

How to evolve WLCG?
A distributed computing infrastructure to provide the production and analysis environments for the LHC experiments:
- Collaboration – the resources are distributed and provided "in kind"
- Service – managed and operated by a worldwide collaboration between the experiments and the participating computer centres
- Implementation – today, general grid technology with high-energy-physics-specific higher-level services
Evolve the implementation while preserving the collaboration & service.

CERN-ATLAS flagship configuration
Monte Carlo jobs (lighter I/O): 10s of MB in/out, ~6-12 hours/job; ran ~40,000 CPU-days.
Ramón Medrano Llamas, Fernando Barreiro, Dan van der Ster (CERN IT), Rodney Walker (LMU Munich)
Difficulties overcome: different visions of clouds, different APIs, networking aspects.
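One of the difficulties listed, "different APIs", is commonly handled by putting a thin adapter layer between the experiment's workload system and each cloud, so the same Monte Carlo payload can be dispatched anywhere. A minimal sketch of that pattern follows; the class and method names are invented for illustration and do not come from any real ATLAS or CERN codebase:

```python
# Illustrative adapter pattern for running one payload on clouds with different
# native APIs (all names here are hypothetical, not a real ATLAS interface).
from abc import ABC, abstractmethod

class CloudBackend(ABC):
    @abstractmethod
    def start_vm(self, image: str) -> str:
        """Boot a worker VM from an image and return an opaque instance id."""

class OpenStackBackend(CloudBackend):
    def start_vm(self, image: str) -> str:
        return f"openstack:{image}"   # a real adapter would call the OpenStack API here

class EC2StyleBackend(CloudBackend):
    def start_vm(self, image: str) -> str:
        return f"ec2:{image}"         # a real adapter would call an EC2-style API here

def run_monte_carlo(backend: CloudBackend, image: str = "atlas-mc-worker") -> str:
    """Submit the same Monte Carlo payload regardless of which cloud is underneath."""
    return backend.start_vm(image)

print(run_monte_carlo(OpenStackBackend()))   # openstack:atlas-mc-worker
print(run_monte_carlo(EC2StyleBackend()))    # ec2:atlas-mc-worker
```

The workload system only ever sees the `CloudBackend` interface, which is what makes it feasible to add a new provider without touching the job-submission logic.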

Conclusions
- The physics community took the concept of a grid and turned it into a global, production-quality service, aggregating massive resources to meet the needs of the LHC collaborations
- The results of this development serve a wide range of research communities; have helped industry understand how it can use distributed computing; have launched a number of start-up companies; and have provided the IT service industry with new tools to support its customers
- Open source licences encourage the uptake of the technology by other research communities and industry, while ensuring the research community's contribution is acknowledged
- Providing industry and research communities with access to computing infrastructures for prototyping purposes reduces the investment and risk in adopting new technologies
October – The LHC Computing Grid – Bob Jones

Conclusions
- Many research communities and business sectors are now facing an unprecedented data deluge; the physics community, with its LHC programme, has unique experience in handling data at this scale
- The ongoing work to evolve the LHC computing infrastructure to make use of cloud computing technology can serve as an excellent test ground for the adoption of cloud computing in many research communities, business sectors and government agencies
- The Helix Nebula initiative is driving the physics community's exploration of how commercial cloud services can serve the research infrastructures of the future and provide new markets for European industry