Presentation transcript:

Grid Computing: dealing with GB/s dataflows
David Groep, NIKHEF
Jan Just Keijser, Nikhef, 3 May 2012
Graphics: Real Time Monitor, Gidon Moont, Imperial College London

LHC Computing
The Large Hadron Collider: 'the world's largest microscope', 'looking at the fundamental forces of nature'. 27 km circumference, CERN, Genève.
(Scale illustration: atom, nucleus, quarks.)
~20 PByte of data per year, processed by ~ modern PC-style computers.

ATLAS Trigger Design
Level 1 – hardware based, online – accepts 75 kHz, latency 2.5 µs – 160 GB/s
Level 2 – 500-processor farm – accepts 2 kHz, latency 10 ms – 5 GB/s
Event Filter – 1600-processor farm – accepts 200 Hz, ~1 s per event – incorporates alignment and calibration – 300 MB/s
From: The ATLAS trigger system, Srivas Prasad
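As a quick sanity check on these figures, the short Python sketch below (not part of the original slides) derives the implied event size at each trigger level by dividing the quoted throughput by the accept rate; all numbers are taken directly from the slide above.

```python
# Back-of-the-envelope check of the trigger figures quoted above:
# throughput divided by accept rate gives the implied event size per level.
levels = [
    # (name, accept rate in Hz, throughput in bytes/s) -- from the slide
    ("Level 1",      75_000, 160e9),
    ("Level 2",       2_000,   5e9),
    ("Event Filter",    200, 300e6),
]

for name, rate_hz, rate_bytes in levels:
    event_mb = rate_bytes / rate_hz / 1e6
    print(f"{name:12s}: {rate_hz:7.0f} Hz x {event_mb:.1f} MB/event = "
          f"{rate_bytes / 1e9:6.2f} GB/s")
```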

Signal/Background
Data volume: (high rate) × (large number of channels) × (4 experiments) → 20 PetaBytes of new data per year.
Compute power: (event complexity) × (number of events) × (thousands of users) → processors.
(Illustration: a stack of CDs holding one year of LHC data would be ~20 km tall; for comparison, a balloon reaches 30 km, Concorde 15 km, Mont Blanc 4.8 km.)

Scientific Compute e-Infrastructure
Key characteristics of SARA and BiG Grid compute services:
– Task parallelism (also known as function parallelism or control parallelism): parallelization of computer code across multiple processors, focusing on distributing different execution processes (threads) across the compute nodes.
– Data parallelism (also known as loop-level parallelism): parallelization across multiple processors, focusing on distributing the data itself across the compute nodes, each applying the same operation to its share.
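To make the distinction concrete, here is a minimal, self-contained Python sketch (nothing grid-specific; the function names detector_simulation, track_reconstruction and partial_sum are hypothetical) contrasting the two forms of parallelism using the standard multiprocessing module.

```python
from multiprocessing import Pool

def detector_simulation(n):         # hypothetical task A
    return sum(i % 7 for i in range(n))

def track_reconstruction(n):        # hypothetical task B
    return sum(i % 5 for i in range(n))

def partial_sum(chunk):             # the same operation for every data chunk
    return sum(chunk)

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # Task parallelism: *different* operations running concurrently.
        a = pool.apply_async(detector_simulation, (1_000_000,))
        b = pool.apply_async(track_reconstruction, (1_000_000,))

        # Data parallelism: one dataset split into chunks, the *same*
        # operation applied to every chunk in parallel.
        data = list(range(1_000_000))
        chunks = [data[i::4] for i in range(4)]
        sums = pool.map(partial_sum, chunks)

        print(a.get(), b.get(), sum(sums))
```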

What is BiG Grid? A collaborative effort of NBIC, NCF and Nikhef that aims to set up a grid infrastructure for scientific research. This research infrastructure combines compute clusters and data storage with specific middleware and software, to enable research that needs more than just raw computing power or data storage. We aim to assist scientists from all backgrounds in exploring and using the opportunities offered by the Dutch e-science grid.

Nikhef (NDPF): 3336 processor cores, 1600 TByte disk, 160 Gbps network
SARA (GINA + LISA): 3000 processor cores, 1800 TByte disk, 2000 TByte tape, 160 Gbps network
RUG-CIT (Grid): 400 processor cores, 8,800 GByte disk, 10 Gbps network
Philips Research Eindhoven: 1600 processor cores, 100 TByte disk, 1 Gbps network

Virtual Laboratory for e-Science
– Avian Alert & FlySafe: Willem Bouten et al., UvA Institute for Biodiversity and Ecosystem Dynamics (IBED)
– Data integration for genomics, proteomics, etc. analysis: Timo Breit et al., Swammerdam Institute for Life Sciences
– Medical Imaging & fMRI: Silvia Olabarriaga et al., AMC and UvA IvI
– Molecular Cell Biology & 3D Electron Microscopy: Bram Koster et al., LUMC Microscopic Imaging group
Image sources: VL-e Consortium Partners

BiG Grid
– SCIAMACHY: Wim Som de Cerff et al., KNMI
– MPI Nijmegen: Psycholinguistics
Image sources: BiG Grid Consortium Partners

BiG Grid
– LOFAR: LOw Frequency ARray radio telescope
– Leiden Grid Initiative: Computational Chemistry
Image sources: BiG Grid Consortium Partners

Grid organisation: National Grid Initiatives & the European Grid Initiative
At the national level, a grid infrastructure is offered to national and international users by the NGIs; BiG Grid is (de facto) the Dutch NGI.
The 'European Grid Initiative' coordinates the efforts of the different NGIs and ensures interoperability.
There are circa 40 European NGIs, with links to South America and Taiwan.
The headquarters of EGI are at the Science Park in Amsterdam.

Cross-domain and global e-Science grids
The communities that make up the grid are not under a single hierarchical control; they temporarily join forces to solve a particular problem at hand, bring to the collaboration a subset of their resources, and share those at their discretion, each under their own conditions.

Challenges: scaling up
Grid especially means scaling up:
– distributed computing on many different computers,
– distributed storage of data,
– large amounts of data (Giga-, Tera-, Petabytes),
– large numbers of files (millions).
This gives rise to "interesting" problems:
– remote logins are not always possible on the grid,
– debugging a program is a challenge,
– regular filesystems tend to choke on millions of files (see the sketch below),
– storing data is one thing; searching and retrieving it turn out to be even bigger challenges.
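As an illustration of the "millions of files" problem, the sketch below shows one common workaround: a hashed two-level directory layout, so that no single directory ever holds millions of entries. This is a generic technique under assumed, hypothetical paths, not something prescribed by the grid middleware.

```python
import hashlib
from pathlib import Path

def hashed_path(base: Path, filename: str) -> Path:
    """Map a file name onto a two-level hashed directory tree."""
    digest = hashlib.sha1(filename.encode()).hexdigest()
    # e.g. data/6f/3a/event_000001.root instead of data/event_000001.root
    return base / digest[:2] / digest[2:4] / filename

target = hashed_path(Path("data"), "event_000001.root")
target.parent.mkdir(parents=True, exist_ok=True)
target.write_text("placeholder contents")
print(target)
```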

Challenges: security
Why is security so important for an e-Science infrastructure?
– e-Science communities are not under a single hierarchical control;
– as a grid site administrator you are allowing relatively unknown persons to run programs on your computers;
– all of these computers are connected to the internet via an incredibly fast network.
This makes the grid a potentially very dangerous service on the internet.

Lessons Learned: Data Management
Storing Petabytes of data is possible, but...
– retrieving data is harder than you would expect;
– organising such amounts of data is non-trivial;
– applications are much smaller than the data they need to process, so always bring your application to the data, if possible;
– the "data about the data" (metadata) becomes crucial: location, experimental conditions, date and time.
Storing the metadata in a database can be a life-saver (a sketch follows below).
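As a concrete illustration of that last point, here is a minimal sketch of such a metadata catalogue using SQLite from the Python standard library; the schema, the logical file name and the srm:// replica URL are hypothetical examples, not a prescribed layout.

```python
import sqlite3

con = sqlite3.connect("metadata.db")
con.execute("""
    CREATE TABLE IF NOT EXISTS files (
        lfn        TEXT PRIMARY KEY,   -- logical file name
        location   TEXT,               -- storage element / replica URL
        conditions TEXT,               -- experimental conditions
        taken_at   TEXT                -- date and time of data taking
    )
""")
con.execute(
    "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
    ("run42/event_000001.root",
     "srm://se.example.org/pnfs/run42/event_000001.root",
     "beam=3.5TeV; detector=on",
     "2012-05-03T12:00:00Z"),
)
con.commit()

# Find every replica recorded for a given run before ever touching the
# (much larger) data files themselves.
for row in con.execute(
        "SELECT lfn, location FROM files WHERE lfn LIKE 'run42/%'"):
    print(row)
```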

Lessons Learned: Job efficiency
A recurring complaint about grid computing is low job efficiency (~94%). It is important to know that:
– failed jobs almost always fail due to data access issues;
– if you remove the data access issues, job efficiency jumps to ~99%, which is on par with cluster and cloud computing.
Mitigation strategies:
– replicate files to multiple storage systems;
– pre-stage data to specific compute sites;
– "program for failure" (see the sketch below).
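A minimal sketch of what "program for failure" can look like in practice: try every known replica of an input file before giving up, and retry after a short back-off. The fetch function and the replica URLs are hypothetical placeholders for whatever data-access tools a site actually provides.

```python
import time

def fetch(url: str) -> bytes:
    """Placeholder for whatever copy/read tool your site provides."""
    raise IOError(f"storage element unreachable: {url}")

def fetch_with_fallback(replicas, rounds=2, delay=1.0):
    """Try every replica, in several rounds, before giving up."""
    for attempt in range(1, rounds + 1):
        for url in replicas:
            try:
                return fetch(url)
            except IOError as err:
                print(f"round {attempt}: {err}")
        time.sleep(delay)  # back off before trying the replicas again
    raise RuntimeError("all replicas failed")

replicas = [
    "srm://se1.example.org/data/run42/event_000001.root",
    "srm://se2.example.org/data/run42/event_000001.root",
]
try:
    payload = fetch_with_fallback(replicas)
except RuntimeError as err:
    print(err)
```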

Lessons Learned: Network bandwidth
All data taken by the LHC at CERN is replicated out to 11 Tier-1 centres around the world; BiG Grid serves as one of those Tier-1s. We always knew we had a good network, but:
– having a dedicated optical private network (OPN) from CERN to the Tier-1 data storage centres turned out to be crucial;
– it turns out that the network bandwidth between the storage and compute clusters is equally important.
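Some rough arithmetic behind this lesson, using figures that appear earlier in this talk (~20 PByte of new data per year, 10 Gbit/s site links, a 100 TByte disk cache); this is an order-of-magnitude sketch only.

```python
PB = 1e15                # bytes
TB = 1e12                # bytes
YEAR = 365 * 24 * 3600   # seconds

# Average rate needed just to move ~20 PByte of new data per year,
# ignoring protocol overhead and peak loads.
avg_gbit_s = 20 * PB * 8 / YEAR / 1e9
print(f"20 PByte/year corresponds to ~{avg_gbit_s:.1f} Gbit/s sustained")

# Time to drain a 100 TByte disk cache over a single 10 Gbit/s link --
# a reminder that storage-to-compute bandwidth matters as much as the
# CERN-to-Tier-1 links.
hours = 100 * TB * 8 / 10e9 / 3600
print(f"100 TByte over 10 Gbit/s takes ~{hours:.0f} hours")
```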

Questions?