An Introduction to The Grid
Mike Wilde
Mathematics and Computer Science Division, Argonne National Laboratory
Oak Park River Forest High School
Topics
l Grids in a nutshell
  –What are Grids?
  –Why are we building Grids?
l What Grids are made of
l The Globus Project and Toolkit
l How Grids are helping (big) Science
A Grid to Share Computing Resources
Grid Applications
l Authenticate once
l Submit a grid computation (code, resources, data, …)
l Locate resources
l Negotiate authorization, acceptable use, etc.
l Select and acquire resources
l Initiate data transfers, computation
l Monitor progress
l Steer computation
l Store and distribute results
l Account for usage
Natural Science drives Computer Science
Scientists write software to probe the nature of the universe
Data Grids for High Energy Physics
l Tiered data flow (image courtesy Harvey Newman, Caltech): the Online System feeds the CERN Computer Centre (Tier 0, Offline Processor Farm, ~20 TIPS), which feeds Tier 1 Regional Centres (FermiLab ~4 TIPS; France, Italy, Germany), then Tier2 Centres (~1 TIPS each, e.g. Caltech), institute servers (~0.25 TIPS), and physicist workstations (Tier 4)
l Link speeds: ~PBytes/sec from the detector, ~100 MBytes/sec into the computer centre, ~622 Mbits/sec between tiers (or Air Freight, deprecated), ~1 MBytes/sec to workstations
l There is a "bunch crossing" every 25 nsecs; there are 100 "triggers" per second; each triggered event is ~1 MByte in size
l Physicists work on analysis "channels"; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server
l 1 TIPS is approximately 25,000 SpecInt95 equivalents
The Grid
l Emerging computational and networking infrastructure
  –Pervasive, uniform, and reliable access to remote data, computational, sensor, and human resources
l Enable new approaches to applications and problem solving
  –Remote resources the rule, not the exception
l Challenges
  –Many different computers and operating systems
  –Failures are common – something is always broken
  –Different organizations have different rules for security and computer usage
Motivation
Sharing the computing power of multiple organizations to help virtual organizations solve big problems
Elements of the Problem
l Resource sharing
  –Computers, storage, sensors, networks, …
  –Sharing always conditional: issues of trust, policy, negotiation, payment, …
l Coordinated problem solving
  –Beyond client-server: distributed data analysis, computation, collaboration, …
l Dynamic, multi-institutional virtual orgs
  –Community overlays on classic org structures
  –Large or small, static or dynamic
Size of the problem
l Teraflops of compute power
  –Equal to n,000 1 GHz Pentiums
l Petabytes of data per year per experiment
  –1 PB = 25,000 40 GB disks
l 40 Gb/sec of network bandwidth
  –Equal to 400 100 Mb/sec LAN cables (stretched across the country and the Atlantic)
Sockets – the basic building block
Program A (send, recv) <-> IP network <-> Program B (send, recv)
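The send/recv picture above can be sketched in a few lines of Python. This illustrative snippet uses socket.socketpair() so both "programs" live in one process; the function name and messages are made up for the example:

```python
import socket

def socket_roundtrip(message: bytes) -> bytes:
    """Round-trip a message between two socket endpoints.

    socket.socketpair() returns two already-connected sockets,
    standing in for Program A and Program B on the IP network.
    """
    a, b = socket.socketpair()       # Program A's end, Program B's end
    try:
        a.sendall(message)           # A: send
        received = b.recv(1024)      # B: recv
        b.sendall(received.upper())  # B: send a reply back
        return a.recv(1024)          # A: recv
    finally:
        a.close()
        b.close()
```

Real grid programs do exactly this, only with the two endpoints on different machines.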
Services are built on Sockets
Client: Web Browser (send, recv) <-> IP network <-> Server: Web Server (send, recv)
Protocol: http
Client-Server Model
Many clients, one server: each Client (Web Browser, send/recv) talks across the IP network to the single Server (Web Server, recv/send)
Protocol: http
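A minimal sketch of the one-server, many-clients shape in Python. The port is chosen by the OS, and the toy greeting protocol stands in for HTTP; all names are invented for illustration:

```python
import socket
import threading

def start_server(n_clients: int) -> int:
    """Start a toy server that answers n_clients clients, one at a time.

    Same shape as the slide: one server socket, many browser-like
    clients connecting to it. Returns the listening port.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))          # port 0: OS picks a free port
    srv.listen()
    port = srv.getsockname()[1]

    def accept_loop() -> None:
        for _ in range(n_clients):
            conn, _addr = srv.accept()
            with conn:
                name = conn.recv(1024)            # recv the request
                conn.sendall(b"hello, " + name)   # send the reply
        srv.close()

    threading.Thread(target=accept_loop, daemon=True).start()
    return port

def ask_server(port: int, name: bytes) -> bytes:
    """One 'browser': connect, send a request, read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as c:
        c.sendall(name)
        return c.recv(1024)
```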
Familiar Client-Server Apps
l E-mail
  –Protocols: POP, SMTP
l File Copying
  –Protocol: FTP
l Logging in to remote computers
  –Protocol: Telnet
Peer-to-Peer Model
No central server: limewire peers (send, recv) exchange data directly across the IP network
Protocol: gnutella
Familiar Peer-to-Peer Apps
l File (music) Sharing
  –Protocols: Napster, Gnutella
l Chat (sort of)
  –Protocols: IRC, Instant Messenger
l Video Conferencing
  –Protocol: H.323
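Peer-to-peer differs from client-server in that a query is flooded from neighbor to neighbor rather than sent to one server. A toy Gnutella-style flood over an in-memory peer graph; all peer and file names are made up, and real Gnutella carries these messages over sockets with a TTL field in each packet:

```python
def flood_search(peers, files, start, wanted, ttl=4):
    """Return the set of peers holding `wanted`, reachable from `start`.

    peers: dict peer -> list of neighbor peers
    files: dict peer -> set of file names that peer shares
    ttl:   hop limit, so queries do not circulate forever
    """
    hits = set()
    seen = {start}
    frontier = [start]
    depth = 0
    while frontier and depth <= ttl:
        nxt = []
        for peer in frontier:
            if wanted in files.get(peer, set()):
                hits.add(peer)              # this peer answers the query
            for neighbor in peers.get(peer, []):
                if neighbor not in seen:    # flood onward, once per peer
                    seen.add(neighbor)
                    nxt.append(neighbor)
        frontier = nxt
        depth += 1
    return hits
```

A small TTL keeps the flood local; raising it widens the search at the cost of more traffic.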
The Globus Project and The Globus Toolkit
The Globus Toolkit: Four Main Components
l Grid Security Infrastructure
  –A trustable digital ID for every user and computer
l Information Services
  –Find all the computers and file servers I can use
l Resource Management
  –Select computers and run programs on them
l Data Management
  –Fast and secure (parallel) data transfer
  –Making and tracking replicas (copies) of files
l …plus Common Software Infrastructure
  –Libraries for writing Grid software applications
Running Programs on the Grid
l Client (outside the site boundary)
  –MDS client API calls to locate resources and to get resource info
  –GRAM client API calls to request resource allocation and process creation; GRAM state change callbacks report progress
l MDS (Grid Index Info Server, Grid Resource Info Server): answers queries about the current status of resources
l Gatekeeper (at the site boundary, guarded by the Globus Security Infrastructure): parses the request using the RSL Library and creates a Job Manager
l Job Manager: allocates & creates processes through the Local Resource Manager; a Process Monitor provides monitoring & control
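The request that the GRAM client hands to the gatekeeper is written in the Globus Resource Specification Language (RSL). A small illustrative request; the executable path and attribute values here are invented, and real sites add their own attributes:

```
& (executable = /usr/local/bin/myapp)
  (count = 4)
  (jobtype = mpi)
  (maxtime = 60)
  (stdout = myapp.out)
```

The gatekeeper parses this, and the Job Manager asks the Local Resource Manager to start four MPI processes with the given limits.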
The Grid Information Problem
l Large numbers of distributed "sensors" with different properties
l Need for different "views" of this information, depending on community membership, security constraints, intended purpose, sensor type
Grid Information Service
GridFTP: a Ubiquitous, Secure, High-Performance Data Access Protocol
l Common transfer protocol
  –All systems can exchange files with each other
l VERY fast
  –Sends files faster than 1 Gigabit per second
l Secure
  –Makes important data hard to damage or intercept
l Applications can tailor it to their needs
  –Building in security or "on the fly" processing
l Interfaces to many storage systems
  –Disk farms, tape robots
Striped GridFTP Server
l A GridFTP server master (mpirun) drives a Parallel Backend of GridFTP server processes, coordinated over MPI (Comm_World and sub-communicators), each with its own plug-in control
l The backends access a Parallel File System (e.g. PVFS, PFS, etc.) through MPI-IO
l The GridFTP client holds the control socket (GridFTP Control Channel); GridFTP Data Channels run to the client or to another striped GridFTP server
Striped GridFTP Application: Video Server
Replica Catalog Structure
l Logical Collection: C02 measurements 1998 (also: C02 measurements 1999)
  –Logical Files: Jan 1998, Feb 1998, … (with attributes such as size); a Logical File Parent links each file to its collection
l Locations registered in the Replica Catalog:
  –jupiter.isi.edu: Protocol: GridFTP, UrlConstructor: gridftp://jupiter.isi.edu/nfs/v6/climate; Filenames: Jan 1998, Feb 1998, Mar 1998, Jun 1998, Oct 1998
  –sprite.llnl.gov: Protocol: ftp, UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi; Filenames: Jan 1998 … Dec 1998
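The catalog structure above amounts to a mapping from logical names to physical copies. A minimal in-memory sketch of the lookup; the dictionary layout and function are invented, while the hosts, paths, and file names come from the slide:

```python
# Toy replica catalog: logical collection -> list of physical locations.
CATALOG = {
    "C02 measurements 1998": [
        {"protocol": "gridftp",
         "url_constructor": "gridftp://jupiter.isi.edu/nfs/v6/climate",
         "files": ["Jan 1998", "Feb 1998", "Mar 1998",
                   "Jun 1998", "Oct 1998"]},
        {"protocol": "ftp",
         "url_constructor": "ftp://sprite.llnl.gov/pub/pcmdi",
         "files": ["Jan 1998", "Dec 1998"]},
    ],
}

def locate(collection: str, logical_file: str) -> list:
    """Return every physical URL that holds a copy of logical_file."""
    urls = []
    for location in CATALOG.get(collection, []):
        if logical_file in location["files"]:
            # UrlConstructor + filename = the physical URL of one replica
            urls.append(location["url_constructor"] + "/" + logical_file)
    return urls
```

A client would then pick the closest or fastest replica from the returned list.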
Programming with Globus
l UNIX based (Windows coming soon)
  –Used by the rest of the Globus Toolkit
  –Users can use it for portability & convenience
  –Windows, UNIX, and Macintosh computers can all join the Grid
  –Portable programming is very important
l Event-Driven Programming
  –A way of writing programs that handle many things at once
l Parallel Programs
  –Writing programs that can utilize many computers to solve a single problem
  –MPI: a popular Message Passing Interface developed at Argonne and other laboratories
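Event-driven programming, as mentioned above, means one thread reacting to whichever of many I/O sources is ready, instead of one thread per connection. A small sketch with Python's standard selectors module; the demo function and its toy uppercase "protocol" are made up:

```python
import selectors
import socket

def event_loop_demo(messages):
    """Serve several connections at once from a single thread.

    Each socketpair stands in for one remote client that has already
    sent a message; the selector tells us which socket is ready.
    """
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in messages]
    replies = {}
    for (server_end, client_end), msg in zip(pairs, messages):
        client_end.sendall(msg)            # every 'client' sends at once
        server_end.setblocking(False)
        sel.register(server_end, selectors.EVENT_READ)
    done = 0
    while done < len(messages):
        for key, _mask in sel.select():    # wait for any ready socket
            data = key.fileobj.recv(1024)  # handle that one event
            replies[data] = data.upper()
            sel.unregister(key.fileobj)
            done += 1
    for server_end, client_end in pairs:
        server_end.close()
        client_end.close()
    return replies
```

The same pattern scales to hundreds of grid connections without hundreds of threads.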
Grids and Applications
Hunting for Gravity Waves
Grid Communities and Applications: Network for Earthquake Eng. Simulation
l NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
l On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
The 13.6 TF TeraGrid: Computing at 40 Gb/s
l Site resources: NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, Argonne
l Archival storage (HPSS, UniTree) and external networks at each site
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
iVDGL Map
Map legend: Tier0/1, Tier2, and Tier3 facilities; 10+ Gbps, 2.5 Gbps, 622 Mbps, and other links
What's it like to Work on the Grid?
l A fascinating problem on the frontiers of computer science
l Work with people from around the world and many branches of science
l Local labs and universities are at the forefront
  –Argonne, Fermilab
  –Illinois (UIC and UIUC), U of Chicago, Northwestern
  –Wisconsin also very active!
Access Grid
l Collaborative work among large groups
l ~50 sites worldwide
l Use Grid services for discovery, security
l Node equipment: ambient mic (tabletop), presenter mic, presenter camera, audience camera
Access Grid: Argonne, others
Come Visit and Explore
l Argonne and Fermilab are right in our own backyard!
  –Visits
  –Summer programs
Supplementary Material
Executor Example: Condor DAGMan
l Directed Acyclic Graph Manager
l Specify the dependencies between Condor jobs using a DAG data structure
l Manage dependencies automatically
  –(e.g., "Don't run job B until job A has completed successfully.")
l Each job is a "node" in the DAG
l Any number of parent or child nodes
l No loops
(Example diamond DAG: Job A feeds Jobs B and C, which both feed Job D)
Slide courtesy Miron Livny, U. Wisconsin
Executor Example: Condor DAGMan (Cont.)
l DAGMan acts as a "meta-scheduler"
  –holds & submits jobs to the Condor queue at the appropriate times, based on DAG dependencies
l If a job fails, DAGMan continues until it can no longer make progress, then creates a "rescue" file with the current state of the DAG
  –When the failed job is ready to be re-run, the rescue file is used to restore the prior state of the DAG
Slide courtesy Miron Livny, U. Wisconsin
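The four-job example on these slides is handed to DAGMan as a plain-text DAG file. A sketch assuming the classic diamond shape (A before B and C, both before D); the .sub submit-file names are placeholders:

```
# Diamond DAG: A must finish before B and C; both must finish before D
JOB A a.sub
JOB B b.sub
JOB C c.sub
JOB D d.sub
PARENT A CHILD B C
PARENT B C CHILD D
```

DAGMan submits A first, releases B and C when A succeeds, and releases D only after both B and C succeed.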
Virtual Data in CMS
Long Term Vision of CMS: CMS Note 2001/047, GriPhyN
CMS Data Analysis
l Data flow: raw data (simulated or real) -> Reconstruction Algorithm (with Calibration data) -> reconstructed data (produced by physics analysis jobs) -> Jet finder 1 / Jet finder 2 -> Tag 1 / Tag 2
l Categories: uploaded data, virtual data, algorithms; events 1, 2, 3 with per-object sizes of 100b, 200b, 5K, 7K, 100K, 50K, 300K, 100K, 50K, 100K, 200K, 100K
l Dominant use of Virtual Data in the Future
GriPhyN-CMS Demo: Production Pipeline (SC2001 Demo Version)

  Stage:       pythia       cmsim      writeHits   writeDigis
  Output file: truth.ntpl   hits.fz    hits.DB     digis.DB
  Data size:   0.5 MB       175 MB     275 MB      105 MB
  CPU time:    2 min        8 hours    5 min       45 min

(1 run = 500 events)
GriPhyN: Virtual Data Tracking Complex Dependencies
l The dependency graph is:
  –Files: 8 < (1,3,4,5,7), 7 < 6, (3,4,5,6) < 2
  –Programs: 8 < psearch, 7 < summarize, (3,4,5) < reformat, 6 < conv, (1,2) < simulate
l As a pipeline: simulate -t 10 -> file1, file2; reformat -f fz file2 -> files 3, 4, 5; conv -i esd -o aod file2 -> file6; summarize -t 10 file6 -> file7; psearch -t 10 -> file8 (the requested file)
Re-creating Virtual Data
l To re-create file 8: Step 1
  –simulate -> file1, file2
Re-creating Virtual Data
l To re-create file 8: Step 2
  –files 3, 4, 5, 6 are derived from file 2
  –reformat -> file3, file4, file5
  –conv -> file6
Re-creating Virtual Data
l To re-create file 8: Step 3
  –file 7 depends on file 6
  –summarize -> file7
Re-creating Virtual Data
l To re-create file 8: final step
  –file 8 depends on files 1, 3, 4, 5, 7
  –psearch -> file8 (the requested file)
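The steps above are a depth-first walk of the derivation graph: to materialize a file, first materialize its inputs, then run its program. A sketch in Python using the dependency graph from these slides; the data structure and function are illustrative, not the actual catalog schema:

```python
# file -> (program that derives it, files that program needs)
DERIVATIONS = {
    "file1": ("simulate", []),
    "file2": ("simulate", []),
    "file3": ("reformat", ["file2"]),
    "file4": ("reformat", ["file2"]),
    "file5": ("reformat", ["file2"]),
    "file6": ("conv", ["file2"]),
    "file7": ("summarize", ["file6"]),
    "file8": ("psearch", ["file1", "file3", "file4", "file5", "file7"]),
}

def recreate(target, have=None, plan=None):
    """Return the programs to run, in order, to materialize target."""
    have = set() if have is None else have
    plan = [] if plan is None else plan
    if target in have:                 # already materialized
        return plan
    program, inputs = DERIVATIONS[target]
    for f in inputs:                   # re-create dependencies first
        recreate(f, have, plan)
    if program not in plan:            # one run may produce several files
        plan.append(program)
    # everything this program produces is now available
    have.update(f for f, (p, _) in DERIVATIONS.items() if p == program)
    return plan
```

Running it on file 8 reproduces the slide's order: simulate, then reformat and conv, then summarize, then psearch.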
Virtual Data Catalog: Conceptual Data Structure
l TRANSFORMATION: /bin/physapp1, version 1.2.3b(2), created on 12 Oct 1998, owned by physbld.orca
l DERIVATION: points (^) to its transformation and its parameter list
l PARAMETER LIST: PARAMETER i filename1; PARAMETER o filename2; PARAMETER e PTYPE=muon; PARAMETER p -g
l FILE: LFN=filename1, PFN1=/store1/ PFN2=/store9/ PFN3=/store4/ (^derivation)
l FILE: LFN=filename2, PFN1=/store1/ PFN2=/store9/ (^derivation)
CMS Pipeline in VDL
(pipeline: pythia_input -> pythia.exe -> cmsim_input -> cmsim.exe -> writeHits -> writeDigis)

begin v /usr/local/demo/scripts/cmkin_input.csh
  file i ntpl_file_path
  file i template_file
  file i num_events
  stdout cmkin_param_file
end

begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
  pre cms_env_var
  stdin cmkin_param_file
  stdout cmkin_log
  file o ntpl_file
end

begin v /usr/local/demo/scripts/cmsim_input.csh
  file i ntpl_file
  file i fz_file_path
  file i hbook_file_path
  file i num_trigs
  stdout cmsim_param_file
end

begin v /usr/local/demo/binaries/cms121.exe
  condor copy_to_spool=false
  condor getenv=true
  stdin cmsim_param_file
  stdout cmsim_log
  file o fz_file
  file o hbook_file
end

begin v /usr/local/demo/binaries/writeHits.sh
  condor getenv=true
  pre orca_hits
  file i fz_file
  file i detinput
  file i condor_writeHits_log
  file i oo_fd_boot
  file i datasetname
  stdout writeHits_log
  file o hits_db
end

begin v /usr/local/demo/binaries/writeDigis.sh
  pre orca_digis
  file i hits_db
  file i oo_fd_boot
  file i carf_input_dataset_name
  file i carf_output_dataset_name
  file i carf_input_owner
  file i carf_output_owner
  file i condor_writeDigis_log
  stdout writeDigis_log
  file o digis_db
end
Virtual Data for Real Science: A Prototype Virtual Data Catalog
l Architecture of the System
  –Virtual Data Catalog (PostgreSQL) with Local File Storage, driven by the Virtual Data Language through the VDL Interpreter (VDLI)
  –Job Submission Sites (ANL, SC, …): Condor-G Agent, Globus Client, GridFTP Server
  –Job Execution Sites (U of Chicago, U of Wisconsin, U of Florida): GridFTP Client, Globus GRAM, Condor Pool; all linked by GSI over the Grid testbed
l Production DAG of Simulated CMS Data: Simulate Physics -> Simulate CMS Detector Response -> Copy flat-file to OODBMS -> Simulate Digitization of Electronic Signals
Early GriPhyN Challenge Problem: CMS Data Reconstruction
l Components: Master Condor job running at Caltech; Caltech workstation; secondary Condor job on the Wisconsin (WI) pool; NCSA Linux cluster; NCSA UniTree (GridFTP-enabled FTP server)
2) Launch secondary job on WI pool; input files via Globus GASS
3) 100 Monte Carlo jobs on Wisconsin Condor pool
4) 100 data files transferred via GridFTP, ~1 GB each
5) Secondary reports complete to master
6) Master starts reconstruction jobs via Globus jobmanager on cluster
7) GridFTP fetches data from UniTree
8) Processed objectivity database stored to UniTree
9) Reconstruction job reports complete to master
Scott Koranda, Miron Livny, others
GriPhyN-LIGO SC2001 Demo
GriPhyN CMS SC2001 Demo
l "Bandwidth Greedy" Grid-enabled Object Collection Analysis for Particle Physics
l Full Event Databases of ~100,000 and ~40,000 large objects; "Tag" database of ~140,000 small objects
l Requests served via parallel tuned GSI FTP
iVDGL
l International Virtual-Data Grid Laboratory
  –A place to conduct Data Grid tests at scale
  –Concrete manifestation of world-wide grid activity
  –Continuing activity that will drive Grid awareness
l Scale of effort
  –For national and international scale Data Grid tests and operations
  –Computation & data intensive computing
l Who
  –Initially US-UK-Italy-EU; Japan, Australia
  –& Russia, China, Pakistan, India, South America?
  –StarLight and other international networks vital
U.S. Co-PIs: Avery, Foster, Gardner, Newman, Szalay
iVDGL Map
Map legend: Tier0/1, Tier2, and Tier3 facilities; 10+ Gbps, 2.5 Gbps, 622 Mbps, and other links
Summary
l "Grids": Resource sharing & problem solving in dynamic virtual organizations
  –Many projects now working to develop, deploy, and apply the relevant technologies
l Common protocols and services are critical
  –The Globus Toolkit is a source of protocol and API definitions and reference implementations
l Rapid progress on the definition, implementation, and application of the Data Grid architecture
  –Harmonizing U.S. and E.U. efforts is important