Published by Roy Reeves. Modified over 9 years ago.
1
Grids at NIKHEF 2004.07.14
Grid Computing @ NIKHEF
David Groep, NIKHEF PDP, 2004.07.14
The (data) problem to solve
beyond meta-computing: the Grid
realizing the Grid at NIKHEF
towards a national infrastructure
2
A Glimpse of the Problem in HEP
Place event info on 3D map
Trace trajectories through hits
Assign type to each track
Find particles you want
Needle in a haystack!
This is the “relatively easy” case
3
The HEP reality
4
HEP Data Rates
The ATLAS experiment
40 MHz (40 TB/sec) → level 1 - special hardware
75 kHz (75 GB/sec) → level 2 - embedded processors
5 kHz (5 GB/sec) → level 3 - PCs
100 Hz (100 MB/sec) → data recording & offline analysis
Reconstructing & analyzing 1 event takes about 90 s
Maybe only a few out of a million are interesting. But we have to check them all!
Analysis program needs lots of calibration, determined from inspecting results of first pass.
Each event will be analyzed several times!
Raw data rate: ~5 PByte/yr/expt.
–total volume: ~20 PByte/yr
–per major centre: ~2 PByte/yr
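The figures on this slide can be cross-checked with a little arithmetic. The sketch below is not from the presentation; it assumes that the 90 s applies to one event on one CPU for a single pass, and the variable names are illustrative only.

```python
# Trigger chain as quoted on the slide: (stage, event rate in Hz, data rate in bytes/s)
stages = [
    ("into level 1", 40e6, 40e12),     # 40 MHz, 40 TB/sec
    ("into level 2", 75e3, 75e9),      # 75 kHz, 75 GB/sec
    ("into level 3", 5e3, 5e9),        # 5 kHz, 5 GB/sec
    ("to data recording", 100.0, 100e6),  # 100 Hz, 100 MB/sec
]

# Event size implied by each pair of numbers (bytes per event)
event_sizes = [data_rate / event_rate for _, event_rate, data_rate in stages]

# Overall reduction in event rate achieved by the three trigger levels
reduction = stages[0][1] / stages[-1][1]

# Assumption: 90 s per event on one CPU, one pass, events processed independently
SECONDS_PER_EVENT = 90.0
cpus_to_keep_up = stages[-1][1] * SECONDS_PER_EVENT

for (name, event_rate, _), size in zip(stages, event_sizes):
    print(f"{name}: {event_rate:.0f} Hz, {size / 1e6:.1f} MB per event")
print(f"rate reduction: {reduction:.0f}x")
print(f"CPUs needed to keep up with one pass: {cpus_to_keep_up:.0f}")
```

Under these assumptions every stage implies the same event size, and a single pass over the recorded stream already needs thousands of CPUs; analyzing each event several times multiplies that figure.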
5
Data handling and computation
[Data-flow diagram. Labels: detector, event filter (selection & reconstruction), raw data, event reprocessing, event simulation, event summary data, processed data, batch physics analysis, analysis objects (extracted by physics topic), interactive physics analysis]
6
HEP is not unique in generating data
LOFAR: 200 MHz, 12 bits, 25k antennas: 60 Tbit/s
Envisat GOME: ~5 TByte/year
Materials analysis (mass spectroscopy, &c): ~2 GByte/10 min
fMRI, PET/MEG, …
LHC data volume necessitates ‘provenance’ and meta-data
information/data ratio even higher in other disciplines
both data and information ownership distributed
access rights for valuable data, add privacy for medical data
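The LOFAR number follows directly from the three quantities quoted with it. A minimal check, assuming that 200 MHz is the sampling rate and that every antenna delivers one 12-bit sample per cycle (the slide does not spell this out); the constant names are illustrative:

```python
# LOFAR figures from the slide
SAMPLE_RATE_HZ = 200e6    # 200 MHz, assumed to be the sampling rate
BITS_PER_SAMPLE = 12
ANTENNAS = 25_000         # "25k antennas"

# Aggregate raw data rate over all antennas
lofar_bits_per_s = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * ANTENNAS
lofar_tbit_per_s = lofar_bits_per_s / 1e12

# Materials analysis: ~2 GByte per 10 minutes, expressed per second
mass_spec_bytes_per_s = 2e9 / (10 * 60)

print(f"LOFAR aggregate: {lofar_tbit_per_s:.0f} Tbit/s")
print(f"Mass spectroscopy: {mass_spec_bytes_per_s / 1e6:.1f} MB/s")
```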
7
Beyond meta-computing: the Grid
How can the Grid help? Via resource accessibility and via sharing
A grid integrates resources that
–are not owned or administered by one single organisation
–speak a common, open protocol … that is generic
–work as a coordinated, transparent system
And …
–can be used by many people from multiple organisations
–who work together in one Virtual Organisation
8
Virtual Organisations
A set of individuals or organisations, not under single hierarchical control, temporarily joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.
A VO is a temporary alliance of stakeholders
–Users
–Service providers
–Information providers
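The definition above can be read as a small data model: members join a VO, providers contribute a subset of their resources, and each provider keeps its own access condition. The sketch below is purely illustrative; the class names, field names and example policies are hypothetical and do not correspond to any grid middleware API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Resource:
    """A resource contributed to the VO by its owner (hypothetical model)."""
    name: str
    owner: str
    # The owner's own sharing condition: user name -> access granted?
    policy: Callable[[str], bool]


@dataclass
class VirtualOrganisation:
    """A temporary alliance of users and providers (hypothetical model)."""
    name: str
    members: Set[str] = field(default_factory=set)
    resources: List[Resource] = field(default_factory=list)

    def usable_by(self, user: str) -> List[str]:
        """Names of resources this user may use, honouring each owner's policy."""
        if user not in self.members:
            return []
        return [r.name for r in self.resources if r.policy(user)]


# Example: two sites share resources under different conditions
vo = VirtualOrganisation("example-vo", members={"alice", "bob"})
vo.resources.append(Resource("cluster-a", "site-a", lambda user: True))
vo.resources.append(Resource("tape-b", "site-b", lambda user: user == "alice"))

print(vo.usable_by("alice"))
print(vo.usable_by("bob"))
print(vo.usable_by("mallory"))
```

The point of the design is that the access decision stays with the resource owner; the VO only records who belongs to the collaboration and what has been contributed.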
9
Common and open protocols
[Layer diagram. Layers: Applications, Application Toolkits, Grid Services, Grid Security Infrastructure (GSI), Grid Fabric. Component labels: GRAM, GridFTP, Information, Replica, DUROC, MPICH-G2, Condor-G, VLAM-G, Farms, Supers, Desktops, TCP/IP, Apparatus, DBs]
10
Standard protocols
New Grid protocols based on popular Web Services: Web Services Resource Framework (WSRF)
Grid adds concept of ‘stateful resources’, like grid-jobs, data elements & data bases, …
Ensure adequate and flexible standards today via the Global Grid Forum
Future developments taken up by industry
11
Access in a coordinated way
Transparent crossing of domain boundaries, satisfying constraints of
–site autonomy
–authenticity, integrity, confidentiality
single sign-on to all services
ways to address services collectively
APIs at the application level
every desktop, laptop, disk is part of the Grid
12
Realization: projects at NIKHEF
Virtual Lab for e-Science (BSIK)
–2004-2008
Enabling Grids for e-Science in Europe (FP6)
–2004-2005/2007
GigaPort NG Network (BSIK)
–2004-2008
NL-Grid Infrastructure (NCF)
–2002-…
EU DataGrid (FP5, finished)
–2001-2003
13
Research threads
1. end-to-end operation for data-intensive sciences (DISc):
–data acquisition – ATLAS Level-3
–wide-area transport, on-line and near-line storage – LCG SC
–data cataloguing and meta-data – D0 SAM
–common API and application layer for DISc – EGEE App+VL-E
2. design scalable and generic Grids
–grid software scalability research, security
3. deployment and certification
–large-scale clusters, storage, networking
14
End to End – the LCG Service Challenge
10 PByte per year exported from CERN (ready in 2006)
Targets for end 2004:
1. SRM-SRM (disk) on 10 Gbps links between CERN, NIKHEF/SARA, TRIUMF, FZK, FNAL
–500 Mb/sec sustained for days
2. Reliable data transfer service
3. Mass storage system to mass storage system
–1. SRM v.1 at all sites
–2. disk-disk, disk-tape, tape-tape
4. Permanent service in operation
–sustained load (mixed user and generated workload)
–> 10 sites
–key target is reliability
–load level targets to be set
slide: Alan Silverman, CERN
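To put the yearly export volume next to the link speeds, it helps to convert it into an average rate. A rough sketch, assuming decimal units (1 PByte = 10^15 bytes), a 365-day year and perfectly even export around the clock; none of these assumptions come from the slide:

```python
# Assumptions: decimal PByte, 365-day year, constant export rate
SECONDS_PER_YEAR = 365 * 24 * 3600
EXPORT_BYTES_PER_YEAR = 10e15   # 10 PByte per year exported from CERN
LINK_GBIT_PER_S = 10.0          # one 10 Gbps link

# Average rate needed to move the yearly volume
avg_bytes_per_s = EXPORT_BYTES_PER_YEAR / SECONDS_PER_YEAR
avg_gbit_per_s = avg_bytes_per_s * 8 / 1e9

# Share of a single 10 Gbps link that this average would occupy
link_fraction = avg_gbit_per_s / LINK_GBIT_PER_S

print(f"average export rate: {avg_bytes_per_s / 1e6:.0f} MB/s")
print(f"                   = {avg_gbit_per_s:.2f} Gbit/s")
print(f"fraction of one 10 Gbps link: {link_fraction:.0%}")
```

This is only the long-term average in aggregate over all destinations; real transfers are bursty, which is one reason the targets stress sustained rates and reliability.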
15
Networking and security
2x10 Gbit/s Amsterdam-Chicago
1x10 Gbit/s Amsterdam-CERN
–ATLAS 3rd level trigger (distributed DACQ)
–protocol tuning and optimization
–monitoring and micro-metering
–LCG service challenge: sustained high-throughput
collaboration with Cees de Laat (UvA AIR) + SURFnet
ideal laboratory for our security thread (many domains)
16
Building the Grid
The Grid is not a magic source of power!
–Need to invest in storage, CPUs, networks
–LHC needs per major centre (assume 10 per expt.): ~3 PByte/yr, ~40 Gbit/s WAN, ~15 000 P4-class 2 GHz CPUs
–… more for a national multi-disciplinary facility
–Collaborative build-up of expertise: NIKHEF, SARA, NCF, UvA, VU, KNMI, ASTRON, AMOLF, ASCI, …
–Resources: NIKHEF resources + NCF’s NL-Grid initiative + …
17
Resources today (the larger ones)
[Diagram labels:]
–1.2 PByte near-line StorageTek
–36 node IA32 cluster ‘matrix’
–468 CPU IA64 + 1024 CPU MIPS
–multi-Gbit links to 100 TByte cache
–7 TByte cache
–140 nodes IA32
–1 Gbit link SURFnet
–multiple links with SARA
only resources with either GridFTP or Grid job management
18
A Facility for e-Science
Many (science) applications with large data volumes:
–Life Sciences: micro-arrays (Utrecht, SILS Amsterdam)
–Medical imaging: functional MRI (AMC), MEG (VU)
–‘omics’ and molecular characterization: sequencing (Erasmus), mass spectroscopy (AMOLF), electron microscopy (Delft, Utrecht)
today such groups are not yet equipped to deal with their >1 TByte data sets; our DISc experience can help
Common need for
–multi-PByte storage
–ubiquitous networks for data exchange
–sufficient compute power, accessible from anywhere
19
Common needs and solutions?
VL-E Proof of Concept environment for e-Science
grid services address the common needs (storage, computing, indexing)
applications can rely on a stable infrastructure
valuable experience as input to industry (mainly industrial research)
can increasingly leverage emerging industry tools
the Grid will be a household term like the Web
by pushing on the PByte leading edge, TByte-sized storage will be an e-Science commodity
20
NIKHEF PDP Team
in no particular order:
End-to-end applications: Templon, Bos, Grijpink, Klous
Security: Groep, Steenbakkers, Koeroo, Venekamp
Facilities: Salomoni, Heubers, Damen, Kuipers, v.d. Akker, Harapan
Scaling and certification: Groep, Starink
embedded in both the physics and the computing groups