e-Infrastructures – future? (specifically in Europe)

Presentation transcript:

e-Infrastructures – future? (specifically in Europe)
Ian Bird, WLCG Overview Board, CERN, 29th October 2013
15 Nov 2013 Ian Bird; WLCG OB

WLCG today
Successfully supported LHC Run 1. Many lessons have been learned, and there have already been several significant changes to the computing models. Experiments are pushing to higher and higher data rates. Funding for future computing is a problem: flat budgets are the (optimistic) working assumption.

WLCG Strategies
Reduce operational effort so that WLCG Tiers can be self-supporting (no need for external funds for operations). Position ourselves so that the experiments can use pledged and opportunistic resources with ~zero configuration: (Grid) clusters, clouds, HPC, … Collaborate with other science communities: share expertise and experience.

External funding?
WLCG benefitted greatly from funding from the EC, the US (DoE/NSF), and other national initiatives. This funding has now largely stopped. Prospects for future funding exist, but the boundary conditions will be very different: we must demonstrate how we benefit other sciences and society at large; we must engage with industry (e.g. via PPP); HEP-only proposals are unlikely to succeed; and full engagement by all partners is needed.

HEP value?
Building and operation of the world’s largest globally federated, distributed infrastructure. Management of multi-petabyte data sets and facilities. A record of collaborating with other scientific domains (EGEE, OSG) and with industry (openlab, Helix Nebula, …). And more… Other sciences now need to address some of the same problems as HEP: we must collaborate. This is one reason why we must avoid HEP-specific solutions as much as possible; we don’t have a good record of building broadly useful tools.

e-Infrastructure vision

Background
Existing e-Infrastructures (mostly) rely on ongoing (short-term) project funding at national and European level. There is no good sustainable funding model so far, and some impression of “free” resources has been given. Recognise that coordination bodies do not control the resources. On the other hand, large research e-infrastructures (such as WLCG) are directly funded, and large research infrastructures have benefitted by being able to leverage the infrastructure investments. But what about the “long tail” of science? These are also silos of expertise and experience, with little sharing. Operational costs are too high. Funding scenarios are very mixed and country-dependent: the long tail often wants to use operations money for buying a few services; others have capital to spend occasionally but limited operations money. Commercial e-infrastructure services have gained in popularity (Google Docs, Dropbox, etc.), but they have limitations when applied to large science (cost models, trustworthiness, data location, etc.), and there is no coherent engagement with industry.

A lot exists today
Existing long-term European e-infrastructure projects: GEANT, EGI, PRACE. Many “pathfinder” initiatives have prototyped aspects of what will be needed in the future. This includes much of the work in the existing e-Infrastructure projects, but also projects such as EUDAT, Helix Nebula, OpenAIRE+, etc., and thematic projects such as WLCG, BioMedBridges/CRISP/DASISH/ENVRI, as well as Transplant, VERCE, Genesi-DEC and many others.

Evolution
Is there a (hybrid) model that can support the long tail of science as well as providing useful services for the large science users; realize a sustainable funding model that does not require long-term funding for operations; and still not exclude funding for specific developments and innovation?

E-Infrastructure Commons: key ideas
Bring together publicly funded infrastructure and potential commercial partners into a hybrid model: innovation for emerging science needs is focused through Research Accelerator Hubs (ReAcH), and commercial partnerships commoditize the services where and when realistic. Encourage consolidation and commercial engagement: create consolidated, innovative services for the broad science domain through fewer centers with broader reach, and engage with industry to offer commodity services in a competitive and consistent way. Ensure sustainability: innovate business models based on a paid service model. Provide legal frameworks: define legal models that will allow for the rapid uptake of services.

EIROforum papers published
EIROforum is a partnership between eight of Europe’s largest inter-governmental scientific research organizations that are responsible for infrastructures and laboratories: CERN, EFDA-JET, EMBL, ESA, ESO, ESRF, European XFEL and ILL. Three EIROforum e-infrastructure papers were published in 2013:
A Vision for a European e-Infrastructure for the 21st Century: https://cds.cern.ch/record/1550136/files/CERN-OPEN-2013-018.pdf
Implementation of a European e-Infrastructure for the 21st Century: https://cds.cern.ch/record/1562865/files/CERN-OPEN-2013-019.pdf
Science, Strategy and Sustainable Solutions, a Collaboration on the Directions of E-Infrastructure for Science: https://cds.cern.ch/record/1545615/files/CERN-OPEN-2013-017.pdf

Vision: principles
Sustainable - RIs currently in construction (FAIR, XFEL, ELIXIR, EPOS, ESS, SKA, ITER and upgrades to ILL and ESRF, etc.) need to be convinced that the e-Infrastructure will exist and continue to evolve throughout their construction and operation phases if they are to take the risk and invest in its creation & exploitation.
Inclusive - Need an e-Infrastructure that supports the needs of the whole European research community, including the “long tail of science”, and interoperates with other regions.
Flexible - Cannot be a one-size-fits-all solution.
Integrated - A coherent set of services and tools must be available to meet the specific needs of each community.
Innovative - Essential that European industry engages with the scientific community to build and provide such services.
User driven - The user community should have a strong voice in the governance of the e-Infrastructure.

Governance by the Users
Create a pan-European forum for organizations and projects that operate at an international level. The forum presents the common needs and opinions to the policy makers and the infrastructure providers, and identifies where there is divergence. It is independent of any supplier and engages across research domains. It supplements but does not replace existing e-infrastructure user engagement channels. It engages with the “long tail” of science. It provides the essential “market” information to e-Infrastructure providers: a market research deliverable including analysis and trends.

Consolidation of Services
Avoid fragmentation of users (big science vs. long tail). Avoid fragmentation of infrastructure (non-integrated and duplicated services). Provide common platforms (e-infrastructure commons) with 3 integrated areas: (1) international network, authorization & authentication, persistent digital identifiers; (2) a small number of facilities to provide cloud and data services of general and widespread usage; (3) software services and tools to provide value-added abilities to the research communities, in a managed repository. Provide for a data continuum: linking the different stages of the data lifecycle, from raw data to publication, and compute services to process this data.

Research Accelerator Hubs
Build a hybrid model of public and commercial service suppliers into a network of Research Accelerator Hubs. Make use of existing European e-infrastructures to jointly offer integrated services to the end-user. ReAcH can be owned and operated by a mixture of commercial companies and public organisations offering a portfolio of services. Services are made available under a set of terms & conditions compliant with European jurisdiction & legislation, with service definitions implementing recognised policies for trust, security and privacy, notably for data protection. A management board, on which the ReAcH operators are represented, provides strategic and financial oversight and is coupled with the user forum. A pilot service (2014) will initially offer a limited set of services at prototype ReAcH.

Example from CERN
This prototype will focus on data-centric services representing a platform on which more sophisticated services can be developed. It will use the resources installed by CERN at the Wigner Research Centre for Physics in Budapest, Hungary. Services will be accessible via single sign-on through a federated identity management system (eduGAIN). The services are: a multi-tenant compute environment to provision/manage networks of VMs on demand (IaaS); a ‘dropbox’-style service for secure file sharing over the internet; a point-to-point reliable, automated file transfer service for bulk data transfers; a long-term archiving service; an open access repository for publications and supporting data, allowing users to create and control their own digital libraries (see www.zenodo.org); integrated digital conferencing tools allowing users to manage their conferences, workshops and meetings; and online training material for the services.

Sustainability of CERN’s ReAcH
Partners will curate their data sets; connect their identity federations; deploy their community-specific services & portals; and manage the interaction with their registered users and associated support activities. Beyond the first year, partners commit to funding the cost of the services their users consume according to a pay-per-usage model (to be jointly developed with CERN during the first year).

Prototype: example from EMBL-EBI
This prototype will serve the broad life science community, based on the successful Embassy cloud piloted since 2011. It will use the resources installed by EMBL-EBI in its tier-3 data centres in London. Services: well-known resources and datasets (UniProtKB, Ensembl, PDBe, ENA); IaaS to other organisations (tenants: currently 8 from the public & private sectors); private sector “pay at cost”. In 2014 it will expand the scale of resources; support large-scale analysis of genomic data via a partnership with the International Cancer Genome Consortium; and integrate with other centres and technologies resulting from Helix Nebula to serve ELIXIR.

Beyond the initial prototypes
Learn from the prototype ReAcH to establish similar structures around Europe. Not identical: each has its own portfolio of services and funding model. All interconnected, to offer a continuum of services. All integrated with public e-infrastructures: the GEANT network (commercial networks are not excluded!), PRACE capability HPC centres, and EGI.

Introducing a pay-per-usage business model
The majority of DCI sites are supported by national funding agencies based on set-up & operational costs. The proposal is to introduce a pay-per-usage model so that funding is linked to the level of usage. Funding agencies can then see the impact of a service and hence have justification for their investment. This gives financial control to the users. Encourage existing Virtual Research Communities to adopt this model: they will choose services that offer better value propositions, and the total cost of service provisioning will be reduced. Services will continue to be free at the point of use.
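
As a rough illustration of how funding could be linked to the level of usage, the sketch below sums per-community charges from usage records. It is only a minimal sketch under stated assumptions: the resource names, the unit prices, and the community names are hypothetical and do not come from the slides, which give no figures and state that the actual model was still to be developed.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical unit prices in EUR; the presentation gives no prices.
UNIT_PRICE_EUR = {
    "cpu_core_hour": 0.03,
    "storage_tb_month": 20.0,
}


@dataclass(frozen=True)
class UsageRecord:
    community: str   # research community that consumed the service (hypothetical names)
    resource: str    # key into UNIT_PRICE_EUR
    quantity: float  # amount consumed, in the unit of that resource


def charges_by_community(records):
    """Return the total charge per community: sum of quantity * unit price."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec.community] += rec.quantity * UNIT_PRICE_EUR[rec.resource]
    return dict(totals)


records = [
    UsageRecord("community-a", "cpu_core_hour", 1000),
    UsageRecord("community-a", "storage_tb_month", 2),
    UsageRecord("community-b", "cpu_core_hour", 500),
]

charges = charges_by_community(records)
for community, total in sorted(charges.items()):
    print(f"{community}: {total:.2f} EUR")
```

In this reading, the charge is settled between the community and its funding agency, so the service can remain free at the point of use for individual researchers while the agency sees usage per service.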

What happens to existing DCI sites that are not equipped to become ReAcH?
Many sites joined DCI projects in order to contribute to scientific challenges. Volunteer computing structures offer an avenue by which they can continue to contribute, but with reduced operational costs: the DEGISCO project and the International Desktop Grid Federation. Integrate volunteer computing into the overall e-infrastructure commons: the EDGI project has developed bridges between volunteer computing and grids and clouds.

How does WLCG fit in this model?
The new model has 3 layers. (1) Base: network, federated identity management, persistent identifiers. (2) Service: cloud service for generally useful data and compute services (IaaS, PaaS, SaaS). (3) Tools and services: sharing (and focused development) of tools and services. At all levels: data management, workflows, software libraries, etc.

WLCG/HEP
The basic assumption is that WLCG and other large-scale research infrastructures are more or less self-sufficient. But they rely on base-level services (1). Benefit: network (GEANT/NRENs), federated ID (eventually interacting with or replacing grid certificates). Contribute: expertise on world-wide trust federations, large-scale network use and debugging, etc. They could make use (but don’t have to) of the service layer (2), e.g. to deploy central services. Contribute: services such as FTS, data archives, etc. Benefit: perhaps deployment of some services. They contribute to, and benefit from, the software and sharing layer (3). Contribute: data management products; workflow engines (e.g. Dirac); many other tools developed for the grid. Benefit: specific development funding for new tools or services; adaptation of useful services/tools from other sciences, etc. Software investments for common libraries, etc. should fit here.

What happens to the operational infrastructure of today?
For WLCG nothing really changes. We must ensure that the services we rely on are supported by our community, and that we fully support our operations. But this must become much more lightweight and automated; we need this anyway for opportunistic resources and “unmanned” Tier 2s. We must do this no matter what: there is a funding gap before H2020 starts, and there is unlikely to be money to support operations in H2020.

An e-Infrastructure system
[Diagram labels: Networks, Federated ID management, etc.; Grid for community; CCS for community; Application software tools and services; Cloud Resource(s); Data Archives; HPC Facilities; Collaborative tools and services; Software investment; Managed services (operated for research communities); Individual science community operated services]
Key principles: governed & driven by science/research communities. Business model: operations should be self-sustaining. Managed services are paid by use (e.g. cloud services, data archive services, …). Community services are operated by the community at their own cost using their own resources (e.g. grids, citizen cyberscience). Software support: open source, funded by collaborating developer institutions.

WLCG and payment?
HEP does not pay twice: HEP sites that (are funded to) provide services to WLCG and HEP as part of a collaboration are assumed to continue like that. There is a complexity of funding models to be resolved in practice, since different countries fund computing for sciences in very different ways. But this does not change the principle of providing services for science with a cost per use.

Conclusion
A new model has been proposed: combining commercial and publicly funded e-infrastructures; allowing for innovation and development, leading to potential commoditization where appropriate; addressing large and small science; managing governance and sustainability; transitioning to an integrated service model; and evolving existing e-infrastructures. The prototype ReAcH and the business models will be tested in 2014. WLCG and HEP can benefit from the sustainable environment and from funding for development/innovation.