1
DATA GRIDS for Science and Engineering: Worldwide Analysis at Regional Centers
Harvey B. Newman, Professor of Physics, Caltech
Islamabad, August 21, 2000
2
LHC Vision: Data Grid Hierarchy
- Tier 0 +1 (CERN): the Online System feeds the Offline Farm / CERN Computer Centre (> 20 TIPS) at ~100 MBytes/sec; the detector itself produces ~PByte/sec before triggering
- Tier 1: regional centres (France, FNAL, Italy, UK) connected to CERN at ~0.6-2.5 Gbits/sec
- Tier 2: centres connected at ~622 Mbits/sec, rising toward ~2.5 Gbits/sec
- Tier 3: institutes (~0.25 TIPS each) with local physics data caches
- Tier 4: physicists' workstations, connected at 100-1000 Mbits/sec
- Bunch crossings every 25 nsec; ~100 triggers per second; each event is ~1 MByte
- Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels
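The ~100 MBytes/sec rate into the offline farm follows directly from the trigger rate and event size quoted on this slide; the short Python sketch below is only a back-of-the-envelope check added here for illustration, not material from the talk.

    # Back-of-the-envelope check of the LHC data rates quoted above (illustrative only).
    event_size_mb = 1.0          # ~1 MByte per selected event
    trigger_rate_hz = 100        # ~100 triggers per second after filtering
    bunch_crossing_ns = 25       # bunch crossings every 25 nsec

    offline_rate = event_size_mb * trigger_rate_hz      # MBytes/sec into the offline farm
    crossing_rate_mhz = 1e3 / bunch_crossing_ns         # raw crossing rate in MHz

    print(f"Raw bunch-crossing rate: {crossing_rate_mhz:.0f} MHz")
    print(f"Rate into CERN offline farm: ~{offline_rate:.0f} MBytes/sec")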
3
Grids: Next Generation Web
- Web: uniform access to HTML documents (http://)
- Grid: flexible, high-performance access to all significant resources: data stores, computers, software catalogs, sensor nets, colleagues
- On-demand creation of powerful virtual computing and data systems
4
Roles of Projects for HENP Distributed Analysis
- RD45, GIOD: networked object databases
- Clipper/GC, FNAL/SAM: high-speed access, processing and analysis of files and object data
- SLAC/OOFS: distributed file system + Objectivity interface
- NILE, Condor: fault-tolerant distributed computing
- MONARC: LHC computing models: architecture, simulation, strategy
- PPDG: first distributed data services and Data Grid system prototype
- ALDAP: OO database structures & access methods for astrophysics and HENP data
- GriPhyN: production-scale Data Grids
- EU Data Grid
5
Grid Services Architecture [*]
- Applications: a rich set of HEP data-analysis related applications
- Application Toolkits: remote visualization, remote computation, remote data, remote sensors, and remote collaboration toolkits
- Grid Services: protocols, authentication, policy, resource discovery & management, instrumentation, ...
- Grid Fabric: data stores, networks, computers, display devices, ...; associated local services
[*] Adapted from Ian Foster: there are computing grids, access (collaborative) grids, data grids, ...
6
The Grid Middleware Services Concept
Standard services that:
- provide uniform, high-level access to a wide range of resources (including networks)
- address interdomain issues: security, policy
- permit application-level management and monitoring of end-to-end performance
Broadly deployed, like Internet protocols; an enabler of application-specific tools as well as of applications themselves.
7
Application Example: Condor Numerical Optimization
- Exact solution of the "nug30" quadratic assignment problem on June 16, 2000: 14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
- Used the "MW" framework, which maps a branch-and-bound problem onto a master-worker structure
- Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1,009 processors), using parallel computers, workstations, and clusters
- MetaNEOS: Argonne, Northwestern, Wisconsin
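MW itself is a C++ framework layered on Condor; purely as an illustration (none of the function names below come from MW or Condor-G), this Python sketch shows how a branch-and-bound search maps onto the master-worker pattern the slide describes: a master keeps a pool of open subproblems and prunes them against the best solution found so far, while workers evaluate and expand individual subproblems.

    # Schematic master-worker branch and bound, in the spirit of (not taken from) MW.
    from queue import Queue

    def branch_and_bound(root, expand, lower_bound, evaluate):
        """Master loop: keep a pool of open subproblems, prune against the incumbent."""
        best_cost, best = float("inf"), None
        pool = Queue()
        pool.put(root)
        while not pool.empty():
            task = pool.get()                   # MW would ship this to a Condor worker
            if lower_bound(task) >= best_cost:
                continue                        # prune: this branch cannot beat the incumbent
            cost, solution = evaluate(task)     # "worker" work: bound/solve the subproblem
            if solution is not None and cost < best_cost:
                best_cost, best = cost, solution
            for child in expand(task):          # new subproblems go back into the pool
                pool.put(child)
        return best_cost, best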
8
Emerging Data Grid User Communities
- NSF Network for Earthquake Engineering Simulation Grid (NEES): integrated instrumentation, collaboration, simulation
- Grid Physics Network (GriPhyN): ATLAS, CMS, LIGO, SDSS
- Particle Physics Data Grid (PPDG)
- EU Data Grid
- Access Grid; VRVS: supporting group-based collaboration
And also:
- The Human Genome Project
- The Earth System Grid and EOSDIS
- Federating brain data
- Computed microtomography
- The Virtual Observatory (US + international)
9
The Particle Physics Data Grid (PPDG)
- First-round goal: optimized cached read access to 10-100 GBytes drawn from a total data set of 0.1 to ~1 PByte
- Matchmaking and resource co-scheduling: SRB, Condor, HRM, Globus
- Services shown in the architecture diagram: a Site-to-Site Data Replication Service (~100 MBytes/sec between a primary site with data acquisition, CPU, disk and tape robot and a regional site) and a Multi-Site Cached File Access Service linking primary and regional sites to university CPU, disk and users
- Participants: ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, U. Wisc/CS
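As a purely illustrative sketch of the "optimized cached read access" goal (the function and paths below are hypothetical; PPDG's real services build on tools such as SRB, HRM and Globus), the idea is to serve a requested file from the regional cache when it is already there, otherwise replicate it once from the primary site and read it locally from then on.

    # Hypothetical sketch of multi-site cached file access: replicate on first use,
    # then read locally. shutil.copy stands in for a wide-area replication service.
    import os, shutil

    def cached_open(logical_name, cache_dir, primary_store):
        local_path = os.path.join(cache_dir, logical_name)
        if not os.path.exists(local_path):               # cache miss at the regional site
            os.makedirs(cache_dir, exist_ok=True)
            shutil.copy(os.path.join(primary_store, logical_name), local_path)
        return open(local_path, "rb")                    # subsequent reads stay local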
10
GriPhyN: PetaScale Virtual Data Grids
Goal: build the foundation for petascale Virtual Data Grids.
Architecture components (from the slide diagram):
- Interactive user tools for production teams, workgroups, and individual investigators
- Virtual data tools, request planning & scheduling tools, and request execution & management tools
- Resource management services, security and policy services, and other Grid services
- Transforms applied to raw data sources
- Distributed resources: code, storage, computers, and networks
11
EU-Grid Project Work Packages
12
Grid Tools for CMS "HLT" Production: A. Samar, M. Hafeez (Caltech), with CERN and FNAL
Distributed job execution and data handling. Goals: transparency, performance, security, fault tolerance, automation.
- Jobs are submitted from one site and executed locally or remotely (Sites A, B, C in the slide diagram)
- The job always writes its data locally
- Data is then replicated to the remote sites
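The job-handling pattern on this slide (execute anywhere, always write output locally, then replicate) can be summarized in a few lines; the sketch below is an illustration only, not the actual CMS production scripts, and run_job and transfer are hypothetical placeholders.

    # Illustrative "write locally, then replicate" production step; run_job() and
    # transfer() are hypothetical stand-ins for the real CMS/Grid tools.
    def produce_and_replicate(run_job, transfer, local_store, remote_sites):
        output_files = run_job(output_dir=local_store)   # data is always written locally
        for site in remote_sites:                        # then replicated to remote sites
            for f in output_files:
                transfer(f, site)                        # e.g. a GSI-authenticated copy
        return output_files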
13
GRIDs in 2000: Summary
- Grids are changing the way we do science and engineering
- Key services and concepts have been identified, and development has started
- Major IT challenges remain: opportunities for collaboration
- The transition of services and applications to production use is starting to occur
- In the future, more sophisticated integrated services and toolsets could drive advances in many fields of science and engineering
- High Energy Physics, facing the need for petascale virtual data, is an early adopter and a leading Data Grid developer
14
The GRID Book
- Book published by Morgan Kaufmann: www.mkp.com/grids
- Globus: www.globus.org
- Grid Forum: www.gridforum.org
15
French GRID Initiative Partners
Computing centres:
- IDRIS, the CNRS high-performance computing centre
- IN2P3 Computing Centre
- CINES, the national intensive-computing centre for higher education
- CRIHAN, the regional computing centre in Rouen
Network departments:
- UREC, the CNRS network department
- GIP Renater
Computing-science CNRS & INRIA labs:
- Université Joseph Fourier
- ID-IMAG
- LAAS
- RESAM
- LIP and PSMN (Ecole Normale Supérieure de Lyon)
Industry:
- Société Communication et Systèmes
- EDF R&D department
Applications development teams (HEP, bioinformatics, Earth observation):
- IN2P3, CEA, Observatoire de Grenoble, Laboratoire de Biométrie, Institut Pierre Simon Laplace
16
LHC Tier 2 Center in 2001 (OC-12 connectivity)
17
ESG Prototype Inter-communication Diagram
Sites and components shown: a Request Manager at LLNL/PCMDI; GSI-wuftpd servers with local disk at LBNL, ISI, ANL, and NCAR; a GSI-pftpd front end to HPSS at SDSC; disk on Clipper plus HPSS with an HRM at LBNL; a Replica Catalog at ANL; and a GIS with NWS. Communication uses GSI-ncftp for data transfer, LDAP (C API or script) for catalog and information services, and CORBA.
18
GriPhyN Scope
Several scientific disciplines:
- US-CMS: high energy physics
- US-ATLAS: high energy physics
- LIGO: gravity-wave experiment
- SDSS: Sloan Digital Sky Survey
Requesting $70M from NSF to build Grids:
- 4 Grid implementations, one per experiment
- Tier2 hardware, networking, people, R&D
- Common problems across the different implementations
- Partnership with CS professionals, IT, industry
- R&D from the NSF ITR Program ($12M)
19
Data Grids: Better Global Resource Use and Faster Turnaround
Efficient resource use and improved responsiveness through:
- treatment of the ensemble of site and network resources as an integrated (loosely coupled) system
- resource discovery and prioritization
- data caching, query estimation, co-scheduling, and transaction management
- network and site "instrumentation": performance tracking, monitoring, problem trapping and handling
20
Emerging Production Grids
- NASA Information Power Grid
- NSF National Technology Grid
21
EU HEP Data Grid Project
22
Grid (IT) Issues to be Addressed
- Data caching and mirroring strategies: object-collection extract/export/transport/import for large or highly distributed data transactions
- Query estimators and query monitors (cf. ATLAS/GC work): enable flexible, resilient prioritisation schemes; query redirection, priority alteration, fragmentation, etc.
- Pre-emptive and real-time data/resource matchmaking: resource discovery, co-scheduling and queueing
- State, workflow, and performance-monitoring instrumentation; tracking and forward prediction
- Security: authentication (for resource allocation/usage and priority); running an international certificate authority
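To make the "data/resource matchmaking" item concrete, here is a deliberately simplified sketch (Condor's real matchmaking uses the ClassAd language; the attribute names below are invented for illustration): a job's requirements are compared against each site's advertised resources, keeping only sites that already hold the input dataset and then picking the one with the most free CPUs.

    # Toy matchmaker: filter sites that satisfy a job's CPU, disk and dataset needs,
    # then pick the one with the most free CPUs. Attribute names are illustrative only.
    def match_site(job, sites):
        candidates = [s for s in sites
                      if s["free_cpus"] >= job["cpus"]
                      and s["free_disk_gb"] >= job["disk_gb"]
                      and job["dataset"] in s["datasets"]]
        return max(candidates, key=lambda s: s["free_cpus"], default=None)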
23
Why Now?
- The Internet as infrastructure: increasing bandwidth, advanced services; a need to explore higher throughput
- Advances in storage capacity: a Terabyte for ~$40k (or ~$10k)
- Increased availability of compute resources: dense (Web) server clusters, supercomputers, etc.
- Advances in application concepts: simulation-based design, advanced scientific instruments, collaborative engineering, ...
24
PPDG Work at Caltech and SLAC
- Work on the NTON connections between Caltech and SLAC:
  - Tests with 8 OC3 adapters on the Caltech Exemplar, multiplexed across to a SLAC Cisco GSR router; throughput was limited by the small MTU in the GSR
  - A Dell dual Pentium III server with two OC12 (622 Mbps) ATM cards, configured to allow aggregate transfers of more than 100 MBytes/sec in both directions between Caltech and SLAC; so far 40 MBytes/sec has been reached on one OC12
- Monitoring tools installed at Caltech/CACR:
  - PingER installed to monitor WAN HEP connectivity
  - A Surveyor device will be installed soon, for very precise measurement of network traffic speeds
- Investigations into a distributed resource management architecture that co-manages processors and data
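For context on the 100 MBytes/sec target above, a quick illustrative arithmetic check: two OC12 links at 622 Mbits/sec each give roughly 155 MBytes/sec of raw capacity, before ATM and IP overheads.

    # Raw-capacity check for the dual-OC12 setup described above (illustrative only).
    oc12_mbits = 622                 # nominal OC12 line rate, Mbits/sec
    cards = 2
    raw_mbytes = cards * oc12_mbits / 8
    print(f"Raw aggregate capacity: ~{raw_mbytes:.0f} MBytes/sec")   # ~156, before overheads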
25
Participants
- Main partners: CERN, INFN (Italy), CNRS (France), PPARC (UK), NIKHEF (Netherlands), ESA Earth Observation
- Other sciences: Earth observation, biology, medicine
- Industrial participation: CS SI (France), DataMat (Italy), IBM (UK)
- Associated partners: Czech Republic, Finland, Germany, Hungary, Spain, Sweden (mostly computer scientists)
- Work with the US: underway; a formal collaboration is being established
- Industry and Research Project Forum with representatives from Denmark, Greece, Israel, Japan, Norway, Poland, Portugal, Russia, and Switzerland
26
GriPhyN: First Production-Scale "Grid Physics Network"
Develop a new form of integrated distributed system, while meeting the primary goals of the LIGO, SDSS and LHC scientific programs.
- Focus on Tier2 centers at universities, in a unified hierarchical Grid of five levels
- 18 centers, with four sub-implementations: 5 each in the US for LIGO, CMS, and ATLAS; 3 for SDSS
- Near-term focus on LIGO and SDSS handling of real data, and on LHC "Data Challenges" with simulated data
- Cooperation with PPDG, MONARC and the EU Grid Project
http://www.phys.ufl.edu/~avery/GriPhyN/
27
GriPhyN: Petascale Virtual Data Grids
An effective collaboration between physicists, astronomers, and computer scientists.
Virtual Data:
- A hierarchy of compact data forms, user collections and remote data transformations is essential, even with future Gbps networks
- Coordination among multiple sites is required
- Coherent strategies are needed for data location, transport, caching and replication, structuring, and resource co-scheduling for efficient access
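The essence of "virtual data" described above is that a derived data product need not be stored explicitly as long as the recipe to regenerate it is recorded; the minimal Python sketch below illustrates that idea under stated assumptions (the catalog and recipe structures are invented for this example and are not GriPhyN interfaces).

    # Minimal virtual-data sketch: look a product up in a catalog of materialized
    # results, or re-derive it from its recipe and cache the result. Illustrative only.
    def get_product(name, catalog, recipes):
        if name in catalog:                    # already materialized somewhere
            return catalog[name]
        transform, inputs = recipes[name]      # recipe: (function, input product names)
        product = transform(*(get_product(i, catalog, recipes) for i in inputs))
        catalog[name] = product                # cache/replicate the derived product
        return product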
28
Sloan Digital Sky Survey Data Grid
Three main functions:
- Raw data processing on a Grid (FNAL): rapid turnaround with TBs of data; accessible storage of all image data
- Fast science analysis environment (JHU): combined data access + analysis of calibrated data; a distributed I/O layer and processing layer shared by the whole collaboration
- Public data access: SDSS data browsing for astronomers and students; a complex query engine for the public