Grids at NIKHEF – David Groep, NIKHEF PDP, 2004.07.14


Outline
 the (data) problem to solve
 beyond meta-computing: the Grid
 realizing the Grid at NIKHEF
 towards a national infrastructure

Grids at NIKHEF  Place event info on 3D map  Trace trajectories through hits  Assign type to each track  Find particles you want  Needle in a haystack!  This is “relatively easy” case A Glimpse of the Problem in HEP

The HEP reality

HEP Data Rates (the ATLAS experiment)
Trigger chain (rates are the input to each stage):
 level 1 – special hardware: 40 MHz (40 TB/sec)
 level 2 – embedded processors: 75 kHz (75 GB/sec)
 level 3 – PCs: 5 kHz (5 GB/sec)
 data recording & offline analysis: 100 Hz (100 MB/sec)

 Reconstructing & analyzing one event takes about 90 s
 Maybe only a few out of a million are interesting – but we have to check them all!
 The analysis program needs lots of calibration, determined from inspecting results of the first pass
   – each event will be analyzed several times!
 Raw data rate ~5 PByte/yr per experiment; total volume ~20 PByte/yr; per major centre ~2 PByte/yr
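
A quick back-of-the-envelope check of these figures (a sketch in Python; the ~1e7 seconds of effective data taking per year is my assumption, not taken from the slide):

    # Back-of-the-envelope check of the data-volume numbers above.
    # Assumptions: ~1e7 s of effective data taking per year (not on the slide);
    # 1 MB/event is implied by the 100 Hz / 100 MB/s figures.
    event_rate_hz = 100            # events written to storage per second
    event_size_mb = 1.0            # 100 Hz at 100 MB/s
    live_seconds_per_year = 1e7    # typical accelerator live time (assumption)

    raw_pb_per_year = event_rate_hz * event_size_mb * live_seconds_per_year / 1e9
    print(f"raw data written: ~{raw_pb_per_year:.1f} PB/year")
    # ~1 PB/yr of raw data; repeated reprocessing passes, simulation and
    # derived data sets multiply this towards the ~5 PB/yr/expt quoted above.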

Data handling and computation
 detector → event filter (selection & reconstruction) → raw data
 raw data → event reprocessing → event summary data (processed data)
 event simulation → simulated raw data
 event summary data → batch physics analysis → analysis objects (extracted by physics topic)
 analysis objects → interactive physics analysis

HEP is not unique in generating data
 LOFAR: 200 MHz, 12 bits, 25k antennas: 60 Tbit/s
 Envisat GOME: ~5 TByte/year
 Materials analysis (mass spectroscopy, &c.): ~2 GByte / 10 min
 fMRI, PET/MEG, …

 The LHC data volume necessitates ‘provenance’ and meta-data
 The information/data ratio is even higher in other disciplines
 Both data and information ownership are distributed
   – access rights for valuable data; add privacy for medical data
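
The LOFAR figure follows directly from the three quoted parameters; a one-line check in Python:

    # Quick check of the LOFAR raw-data rate quoted above:
    # 200 MHz sampling, 12 bits per sample, 25,000 antennas.
    sample_rate_hz = 200e6
    bits_per_sample = 12
    antennas = 25_000

    raw_tbit_per_s = sample_rate_hz * bits_per_sample * antennas / 1e12
    print(f"LOFAR raw rate: {raw_tbit_per_s:.0f} Tbit/s")   # -> 60 Tbit/s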

Beyond meta-computing: the Grid
How can the Grid help? Via resource accessibility and via sharing.

A grid integrates resources that
 – are not owned or administered by one single organisation
 – speak a common, open protocol … that is generic
 – work as a coordinated, transparent system
And …
 – can be used by many people from multiple organisations
 – who work together in one Virtual Organisation

Virtual Organisations
A set of individuals or organisations, not under single hierarchical control, temporarily joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.

A VO is a temporary alliance of stakeholders:
 – Users
 – Service providers
 – Information providers
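
Purely as an illustration (a toy model, not any actual grid middleware), the definition above can be read as a data structure: members from different organisations contribute resources, and each resource stays under its owner's own sharing policy:

    # Toy model of a Virtual Organisation: members contribute resources,
    # each resource remains under its owner's control and sharing policy.
    from dataclasses import dataclass, field

    @dataclass
    class Resource:
        name: str
        owner: str            # the organisation that retains control
        policy: str           # owner-defined sharing condition

    @dataclass
    class VirtualOrganisation:
        name: str
        members: set = field(default_factory=set)
        resources: list = field(default_factory=list)

        def usable_by(self, member):
            """Resources a member may use, honouring each owner's policy."""
            if member not in self.members:
                return []
            return [r for r in self.resources if r.policy == "vo-members"]

    vo = VirtualOrganisation("biomed", members={"AMC", "NIKHEF"})
    vo.resources.append(Resource("farm", "NIKHEF", "vo-members"))
    print([r.name for r in vo.usable_by("AMC")])   # ['farm']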

Common and open protocols
The layered architecture (top to bottom):
 Applications
 Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G
 Grid Services: GRAM, GridFTP, Information services, Replica DBs, Grid Security Infrastructure (GSI)
 Grid Fabric: farms, supercomputers, desktops, TCP/IP, apparatus
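
A minimal sketch of what a common, open protocol buys in practice: any client speaking GridFTP can move data without caring which site or storage system sits behind the URL. This assumes the Globus Toolkit client globus-url-copy is installed and a valid GSI proxy credential exists; the endpoint name below is made up:

    # Sketch: GridFTP copy using the Globus Toolkit CLI client.
    # Requires a valid GSI proxy; hostname and paths are hypothetical.
    import subprocess

    src = "file:///data/run01234.raw"
    dst = "gsiftp://storage.example.org/atlas/run01234.raw"

    subprocess.run(["globus-url-copy", src, dst], check=True)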

Standard protocols
 New Grid protocols are based on popular Web Services: the Web Services Resource Framework (WSRF)
 The Grid adds the concept of ‘stateful resources’, like grid jobs, data elements & databases, …
 Adequate and flexible standards are ensured today via the Global Grid Forum
 Future developments will be taken up by industry

Access in a coordinated way
 Transparent crossing of domain boundaries, satisfying the constraints of
   – site autonomy
   – authenticity, integrity, confidentiality
 single sign-on to all services
 ways to address services collectively
 APIs at the application level
 every desktop, laptop and disk is part of the Grid
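
A sketch (not NIKHEF's actual configuration) of how single sign-on coexists with site autonomy in GSI: the user authenticates once with an X.509 proxy, and each site independently maps the certificate subject to a local account, for instance via a grid-mapfile with one '"<subject DN>" account' entry per line:

    # Sketch: resolve a certificate subject DN to a site-local account
    # using a grid-mapfile ('"<DN>" <account>' per line). DN and path are examples.
    import shlex

    def local_account(dn, mapfile="/etc/grid-security/grid-mapfile"):
        with open(mapfile) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith("#"):
                    continue
                fields = shlex.split(line)      # handles the quoted DN
                if len(fields) >= 2 and fields[0] == dn:
                    return fields[1]            # the account this site chose
        return None

    print(local_account("/O=dutchgrid/O=users/O=nikhef/CN=Some User"))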

Realization: projects at NIKHEF
 Virtual Lab for e-Science (BSIK)
 Enabling Grids for e-Science in Europe (FP6) – …/2007
 GigaPort NG Network (BSIK)
 NL-Grid Infrastructure (NCF) – 2002–…
 EU DataGrid (FP5, finished)

Research threads
1. End-to-end operation for data-intensive sciences (DISc):
   – data acquisition – ATLAS Level-3
   – wide-area transport, on-line and near-line storage – LCG SC
   – data cataloguing and meta-data – D0 SAM
   – common API and application layer for DISc – EGEE App + VL-E
2. Design of scalable and generic Grids:
   – grid software scalability research, security
3. Deployment and certification:
   – large-scale clusters, storage, networking

End to End – the LCG Service Challenge
10 PByte per year exported from CERN (ready in 2006)

Targets for end 2004:
1. SRM–SRM (disk) on 10 Gbps links between CERN, NIKHEF/SARA, TRIUMF, FZK, FNAL
   – 500 Mb/sec sustained for days
2. Reliable data transfer service
3. Mass storage system to mass storage system
   – SRM v.1 at all sites
   – disk-disk, disk-tape, tape-tape
4. Permanent service in operation
   – sustained load (mixed user and generated workload)
   – > 10 sites
   – key target is reliability
   – load level targets to be set

slide: Alan Silverman, CERN
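
Some rough arithmetic behind these targets (my calculation; reading "500 Mb/sec" as per-link throughput is an assumption):

    # Rough arithmetic for the Service Challenge targets above.
    SECONDS_PER_YEAR = 365 * 24 * 3600
    SECONDS_PER_DAY = 24 * 3600

    # 10 PByte/year exported from CERN -> required average rate
    export_bytes_per_year = 10e15
    avg_rate_gbit = export_bytes_per_year * 8 / SECONDS_PER_YEAR / 1e9
    print(f"10 PB/yr ~ {avg_rate_gbit:.1f} Gbit/s sustained, year round")   # ~2.5 Gbit/s

    # 500 Mbit/s sustained on one link -> volume moved per day
    link_rate_bit = 500e6
    tb_per_day = link_rate_bit / 8 * SECONDS_PER_DAY / 1e12
    print(f"500 Mb/s ~ {tb_per_day:.1f} TB per day per link")               # ~5.4 TB/day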

Networking and security
 2×10 Gbit/s Amsterdam–Chicago, 1×10 Gbit/s Amsterdam–CERN
   – ATLAS 3rd-level trigger (distributed DACQ)
   – protocol tuning and optimization
   – monitoring and micro-metering
   – LCG service challenge: sustained high throughput
 collaboration with Cees de Laat (UvA AIR) + SURFnet
 an ideal laboratory for our security thread (many domains)

Building the Grid
The Grid is not a magic source of power!
 – need to invest in storage, CPUs, networks
 – LHC needs per major centre (assume 10 per expt.): ~3 PByte/yr, ~40 Gbit/s WAN, ~ P4-class 2 GHz CPUs
 – … more for a national multi-disciplinary facility
 – collaborative build-up of expertise: NIKHEF, SARA, NCF, UvA, VU, KNMI, ASTRON, AMOLF, ASCI, …
 – resources: NIKHEF resources + NCF’s NL-Grid initiative + …
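
A back-of-the-envelope comparison of the per-centre figures (my arithmetic, not from the slide):

    # What does ~3 PByte/yr mean as a sustained rate, and how does it
    # compare to a ~40 Gbit/s WAN connection?
    SECONDS_PER_YEAR = 365 * 24 * 3600

    ingest_bytes_per_year = 3e15
    avg_gbit = ingest_bytes_per_year * 8 / SECONDS_PER_YEAR / 1e9
    print(f"~3 PB/yr ~ {avg_gbit:.2f} Gbit/s averaged over the year")   # ~0.76 Gbit/s

    wan_gbit = 40
    print(f"headroom: ~{wan_gbit / avg_gbit:.0f}x the average rate "
          "(needed for bursts, reprocessing and redistribution)")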

Resources today (the larger ones)
 1.2 PByte near-line StorageTek
 36-node IA32 cluster ‘matrix’
 468 CPU IA…, … CPU MIPS
 multi-Gbit links to 100 TByte cache
 7 TByte cache
 140 nodes IA32
 1 Gbit link to SURFnet, multiple links with SARA
Only resources with either GridFTP or Grid job management are counted.

A Facility for e-Science
Many (science) applications with large data volumes:
 – Life Sciences: micro-arrays (Utrecht, SILS Amsterdam)
 – Medical imaging: functional MRI (AMC), MEG (VU)
 – ‘omics’ and molecular characterization: sequencing (Erasmus), mass spectroscopy (AMOLF), electron microscopy (Delft, Utrecht)
 Today such groups are not yet equipped to deal with their >1 TByte data sets; our DISc experience can help

Common needs:
 multi-PByte storage
 ubiquitous networks for data exchange
 sufficient compute power, accessible from anywhere

Common needs and solutions?
The VL-E Proof of Concept environment for e-Science:
 grid services address the common needs (storage, computing, indexing)
 applications can rely on a stable infrastructure
 valuable experience as input to industry (mainly industrial research)
 can increasingly leverage emerging industry tools
 the Grid will be a household term like the Web
 by pushing on the PByte leading edge, TByte-sized storage will become an e-Science commodity

The NIKHEF PDP Team (in no particular order)
 End-to-end applications: Templon, Bos, Grijpink, Klous
 Security: Groep, Steenbakkers, Koeroo, Venekamp
 Facilities: Salomoni, Heubers, Damen, Kuipers, v.d. Akker, Harapan
 Scaling and certification: Groep, Starink
Embedded in both the physics and the computing groups.