Jeremy Coles 3rd November 2016


The story so far …

SKA background
[Slide figure: culture-as-iceberg analogy – conscious mind (meaning of things in community, curation of important artifacts, analytical mind/rationalisation, will power, short-term memory) over the sub-conscious/unconscious (emotions, reflexes, habits, automatic functions such as vital signs).]
- Develop a concept and design for a distributed, federated European Science Data Centre (ESDC) to support the astronomical community in achieving the scientific goals of the SKA. To note: we want to hide the complexities.
- Aims: better understand expertise in the UK; routes to collaboration (who?); understand mutual benefits (i.e. funding!); "Where's the dictionary?"; what is the national view?
- AENEAS:
  - Inventory of SKA science cases and post-SDP computing and data storage requirements
  - Evaluation of existing HPC, cloud and distributed computing technologies
  - Design and costing for a distributed ESDC computing architecture
  - Requirements for interfaces to SKA Science Archives & other repositories
- Next -> identify contributors; check coverage; what can be in-kind… early start options.

GridPP infrastructure & approaches
- What is GridPP; key concepts of a Grid; the 10% offer!
- Needed: software deployment + job & data management + user interface
- Links to use-cases and documentation
- Q&A: 1 PB split over two sites 50:50. Note the CERN-Wigner link has gone to failover several times recently. GridPP will be a peer in UK-T0.

ASTERICS & typical analysis
- Goal: support and accelerate the implementation of the ESFRI telescopes (E-ELT, CTA, SKA, KM3NeT), to enhance their performance beyond the current state-of-the-art, and to see them interoperate as an integrated, multi-wavelength and multi-messenger facility.
- Q&A: signed up to Open Data, but need to build the telescope first! GridPP is funded for HEP. Infrastructure influences. Challenge for commercial cloud: cost & capability.
- Typical analysis requirements:
  - Image-based: HTC or HPDA (like LSST)
  - Source finding: HTC … HPC if a full Bayesian approach; N -> N+1 comparisons
  - Stacking: HPDA; uses different sources
  - Extraction dynamics: HTC then HPDA; machine learning
  - Visibility-based: extreme HPDA/HPC; pipeline implemented

GridPP DIRAC, VMs, OpenStack, HTC, HPC
- The HEP HTC model and how it evolved over 15 years: late binding and pilot jobs
- HEP HTC has no need for coherence; HPC is growing in HEP; mixed workflows
- Could add HPC to DIRAC production requests
- Q: How to load-balance for aggregation? A: Chained.
- Could more complex data dependencies be expressed at job level (dependency graph)?
- Bridging – populate the file catalogue directly?
- What about working with limited local resources? Pilots and matching.
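The late-binding pilot-job idea above can be sketched in a few lines. This is an illustrative toy, not the DIRAC API: the class and field names (JobQueue, Pilot, platform) are invented for the example. The point is that jobs bind to a resource only when a pilot, already running on a real worker node, reports what it can actually provide.

```python
# Minimal sketch of late binding with pilot jobs (illustrative, not DIRAC).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Job:
    job_id: int
    platform: str          # requirement the executing resource must satisfy

@dataclass
class Pilot:
    platform: str          # what this worker node actually provides

class JobQueue:
    """Central task queue; jobs bind to resources only at match time."""
    def __init__(self):
        self._jobs = []

    def submit(self, job: Job):
        self._jobs.append(job)

    def match(self, pilot: Pilot) -> Optional[Job]:
        # Late binding: the pilot describes its real environment and pulls
        # the first compatible job, instead of jobs being pushed blindly
        # to sites that may turn out to be broken or busy.
        for i, job in enumerate(self._jobs):
            if job.platform == pilot.platform:
                return self._jobs.pop(i)
        return None

queue = JobQueue()
queue.submit(Job(1, "x86_64-el7"))
queue.submit(Job(2, "aarch64"))
job = queue.match(Pilot(platform="aarch64"))   # pilot pulls the matching job
```

An incompatible pilot simply gets `None` back and can retire, which is why broken resources waste pilots rather than jobs.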

OpenStack - HPC
- The overhead of virtualisation and remediation strategies
- Emerging capabilities for HPC
- Remaining gaps and options

Data transfer approaches
- Multi-TB/day is routine, with reasonable intercontinental rates
- Use the File Transfer Service (FTS); it helped DiRAC. Watch those file sizes.
- To note:
  - Have a minimum & maximum file size specified
  - Be able to handle transfer interrupts
  - Consider carefully the storage middleware & transfer protocols
  - Checksum validation is important
  - Now looking at federated redirection services
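The checksum-validation point can be sketched as follows. Adler-32 is one of the checksum algorithms FTS supports; the helper names here (`adler32_of`, `transfer_ok`) are illustrative, and a real deployment would compare against the checksum recorded by the source storage system rather than recomputing both sides locally.

```python
# Hedged sketch of post-transfer checksum validation using Adler-32.
import os
import tempfile
import zlib

def adler32_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so multi-GB transfers need not fit in RAM."""
    value = 1  # Adler-32 initial value
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            value = zlib.adler32(chunk, value)
    return f"{value & 0xffffffff:08x}"

def transfer_ok(src_checksum: str, dst_path: str) -> bool:
    # Compare the checksum recorded at the source against the destination copy.
    return adler32_of(dst_path) == src_checksum

# Demonstrate on a small temporary file standing in for a transferred replica.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"hello world")
tmp.close()
checksum = adler32_of(tmp.name)
ok = transfer_ok(checksum, tmp.name)
os.unlink(tmp.name)
```

Streaming in fixed-size chunks is also what makes resuming after a transfer interrupt tractable: a partial file fails the comparison cleanly instead of crashing the validator.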

Data management
- Why did the LFC get replaced!?
- To note: abstraction layers can be enemies of performance. Think about deletion times!
- Considerations: What are the objectives? Testing for the purposes of planning and R&D; building up experience with tools/techniques; helping with pre-construction/commissioning tasks?
- AENEAS is not doing prototyping – it is forbidden!
- WLCG got started in some form around 2003 with challenges; an infrastructure-wide one in 2006… years ahead of the actual target production dates. Many things developed were wrong/changed/evolved, but that step had to be done.
- How much pain can GridPP absorb? There are things that are difficult to write down; to learn, one must try things out. Early on, WLCG required heroic efforts.
- Can we run workflows on the same infrastructure? Are there network QoS criteria? Not really for HEP.
- Evolution of resources and approach is partly driven by resource limitations… also requirements and finance.
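The warning that abstraction layers can be enemies of performance, and the note about deletion times, can be made concrete with a toy model. Nothing here is a real catalogue or storage API; the latency figure is assumed purely for illustration. The contrast is between a generic per-file interface (N round trips) and a backend-native bulk operation (one round trip).

```python
# Illustrative sketch: why a per-file abstraction layer can make deletion slow.
import time

class Backend:
    """Pretend storage backend where every call costs one network round trip."""
    LATENCY = 0.0001  # 0.1 ms per round trip (assumed, for illustration)

    def __init__(self, files):
        self.files = set(files)

    def delete_one(self, name):
        time.sleep(self.LATENCY)      # one round trip per file
        self.files.discard(name)

    def delete_prefix(self, prefix):
        time.sleep(self.LATENCY)      # a single bulk round trip
        self.files = {f for f in self.files if not f.startswith(prefix)}

files = [f"/data/run1/file{i}" for i in range(200)]

b1 = Backend(files)
t0 = time.perf_counter()
for f in list(b1.files):
    b1.delete_one(f)                  # what a naive abstraction layer does
naive = time.perf_counter() - t0

b2 = Backend(files)
t0 = time.perf_counter()
b2.delete_prefix("/data/run1/")       # what the backend could do directly
bulk = time.perf_counter() - t0
```

With 200 files the per-file path pays 200 round trips against one, and the gap grows linearly with catalogue size, which is exactly why deletion times deserve explicit thought.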

Data analytics - HEP
- Overview of data reduction operations; augmentation; derivation datasets
- Rucio for data management
- Analytics: log collection, storage & analysis; ops analysis, insight & steering; physics analysis
- Notes:
  - Don't forget about deletion! Data lifetime considerations
  - Pledged vs used resources – almost x2
  - Sharing resources – the issue is the effort, not the resources!
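The "data lifetime considerations" note can be sketched as a simple expiry policy. The dataset tuples and the default lifetime below are invented for the example; Rucio's actual lifetime model is richer (rules, replicas, grace periods), so treat this only as the shape of the idea.

```python
# Sketch of lifetime-driven cleanup selection (illustrative, not Rucio's model).
from datetime import datetime, timedelta

def expired(datasets, now, default_lifetime=timedelta(days=365)):
    """Return names of datasets past their per-dataset (or default) lifetime."""
    out = []
    for name, created, lifetime in datasets:
        if now - created > (lifetime or default_lifetime):
            out.append(name)
    return out

now = datetime(2016, 11, 3)
datasets = [
    # (name, creation date, explicit lifetime or None for the default)
    ("derived.v1", datetime(2015, 1, 1), None),                 # past default
    ("raw.2016",   datetime(2016, 6, 1), None),                 # still alive
    ("scratch.x",  datetime(2016, 10, 1), timedelta(days=14)),  # short-lived
]
to_delete = expired(datasets, now)
```

Making lifetimes explicit per dataset class (raw vs derived vs scratch) is what lets deletion be routine rather than a panicked disk-full exercise.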

Data analytics - Spark
- Why Spark? Spark use cases
- Performance depends on problem scale and level of parallelism
- Linear algebra case study: the classical MPI-based approach is more efficient, but development effort, interfaces and ecosystem considerations balance this
- Look at commodity cluster vs HPC platform
- Science drivers: climate science; nuclear physics & mass spectrometry
- Study Spark vs C+MPI scaling; mitigate the drawbacks of GFS
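The point that performance depends on problem scale and level of parallelism can be illustrated with Amdahl's law. The 5% serial fraction below is an assumed figure for the sketch, not a measurement of Spark or MPI; real Spark vs C+MPI comparisons also fold in scheduling, serialisation and I/O overheads that this ideal model ignores.

```python
# Amdahl's law: ideal speedup with a fixed serial fraction of the work.
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Speedup on `workers` cores when `serial_fraction` cannot be parallelised."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / workers)

s = 0.05  # assume 5% of the job is inherently serial (illustrative)
speedups = {n: round(amdahl_speedup(s, n), 2) for n in (1, 8, 64, 1024)}
# Even with 1024 workers the speedup saturates well below 1/s = 20.
```

This is why "level of parallelism" alone is not the whole story: past a problem-dependent scale, adding workers buys almost nothing, and the framework's constant overheads start to dominate.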

Coming up…
- Skatelescope.eu VO
- Manchester test case & experiences
- AENEAS objectives & what is needed
- SRCCG?
- Network considerations
- Planning and next steps
- SRC (not SKA!)