An Introduction to The Grid
Mike Wilde
Mathematics and Computer Science Division, Argonne National Laboratory
Oak Park River Forest High School - 2002.0522


1 An Introduction to The Grid
Mike Wilde
Mathematics and Computer Science Division, Argonne National Laboratory
Oak Park River Forest High School - 2002.0522

2 Topics
- Grids in a nutshell
  - What are Grids?
  - Why are we building Grids?
- What Grids are made of
- The Globus Project and Toolkit
- How Grids are helping (big) Science
(Slide footer on this and the following slides: www.globus.org / www.griphyn.org)

3 A Grid to Share Computing Resources

4 Grid Applications
- Authenticate once
- Submit a grid computation (code, resources, data, …)
- Locate resources
- Negotiate authorization, acceptable use, etc.
- Select and acquire resources
- Initiate data transfers, computation
- Monitor progress
- Steer computation
- Store and distribute results
- Account for usage

5 Natural Science drives Computer Science

6 Scientists write software to probe the nature of the universe

7 Data Grids for High Energy Physics
[Diagram: a tiered data grid. The online system feeds the CERN Computer Centre (Tier 0) and an offline processor farm (~20 TIPS) at ~100 MBytes/sec from a physics data cache producing ~PBytes/sec. Tier 1 regional centres (FermiLab ~4 TIPS; France, Italy, Germany) connect at ~622 Mbits/sec (or air freight, deprecated); Tier 2 centres (~1 TIPS each, e.g. Caltech) connect at ~622 Mbits/sec; institute servers (~0.25 TIPS) feed physicist workstations (Tier 4) at ~1 MBytes/sec.]
- There is a "bunch crossing" every 25 nsecs, and 100 "triggers" per second; each triggered event is ~1 MByte in size.
- Physicists work on analysis "channels". Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
- 1 TIPS is approximately 25,000 SpecInt95 equivalents.
Image courtesy Harvey Newman, Caltech

8 The Grid
- Emerging computational and networking infrastructure
  - Pervasive, uniform, and reliable access to remote data, computational, sensor, and human resources
- Enables new approaches to applications and problem solving
  - Remote resources the rule, not the exception
- Challenges
  - Many different computers and operating systems
  - Failures are common – something is always broken
  - Different organizations have different rules for security and computer usage

9 Motivation
Sharing the computing power of multiple organizations to help virtual organizations solve big problems

10 Elements of the Problem
- Resource sharing
  - Computers, storage, sensors, networks, …
  - Sharing always conditional: issues of trust, policy, negotiation, payment, …
- Coordinated problem solving
  - Beyond client-server: distributed data analysis, computation, collaboration, …
- Dynamic, multi-institutional virtual orgs
  - Community overlays on classic org structures
  - Large or small, static or dynamic

11 Size of the problem
- Teraflops of compute power
  - Equal to n,000 1 GHz Pentiums
- Petabytes of data per year per experiment
  - 1 PB = 25,000 40 GB disks
- 40 Gb/sec of network bandwidth
  - 400 100 Mb/sec LAN cables (stretched across the country and the Atlantic)
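The equivalences on this slide are simple arithmetic, and can be checked directly. A quick sketch (decimal units, with 1 PB = 10^15 bytes, as storage vendors count):

```python
# Back-of-the-envelope check of the "size of the problem" figures.
GIGA = 10**9
PETA = 10**15

# Petabytes of data: how many 40 GB disks does it take to hold 1 PB?
disks = PETA // (40 * GIGA)
print(disks)  # 25000

# Network: how many 100 Mb/sec LAN cables add up to 40 Gb/sec?
cables = (40 * 10**9) // (100 * 10**6)
print(cables)  # 400
```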

12 Sockets – the basic building block
[Diagram: Program A and Program B connected over an IP network, each calling send and recv]

13 Services are built on Sockets
[Diagram: a client (web browser) and a server (web server) exchanging send/recv over an IP network; protocol: HTTP]
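The send/recv building block from these two slides can be shown in a few lines. This sketch uses Python's standard socket module; socket.socketpair() gives a connected pair inside one process, standing in for the two programs on an IP network, and the HTTP-looking strings are just illustrative bytes:

```python
import socket

# Two connected endpoints, standing in for "client" and "server".
client, server = socket.socketpair()

client.sendall(b"GET / HTTP/1.0\r\n\r\n")       # client sends a request
request = server.recv(1024)                      # server receives it
server.sendall(b"HTTP/1.0 200 OK\r\n\r\nhi")     # server sends a reply
reply = client.recv(1024)                        # client receives the reply

print(request.decode().split("\r\n")[0])  # GET / HTTP/1.0
print(reply.decode().split("\r\n")[0])    # HTTP/1.0 200 OK

client.close()
server.close()
```

A real web service differs only in scale: the server listens on a public port and many clients connect, but every exchange still bottoms out in these send/recv calls.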

14 Client-Server Model
[Diagram: many web-browser clients, each with its own send/recv, all connected over an IP network to a single web server; protocol: HTTP]

15 Familiar Client-Server Apps
- Email – Protocols: POP, SMTP
- File copying – Protocol: FTP
- Logging in to remote computers – Protocol: Telnet

16 Peer-to-Peer Model
[Diagram: LimeWire peers, each with its own send/recv, exchanging data directly with one another over an IP network; protocol: Gnutella]

17 Familiar Peer-to-Peer Apps
- File (music) sharing – Protocols: Napster, Gnutella
- Chat (sort of) – Protocols: IRC, Instant Messenger
- Video conferencing – Protocol: H.323

18 The Globus Project and The Globus Toolkit

19 The Globus Toolkit: Four Main Components
- Grid Security Infrastructure
  - A trustable digital ID for every user and computer
- Information Services
  - Find all the computers and file servers I can use
- Resource Management
  - Select computers and run programs on them
- Data Management
  - Fast and secure (parallel) data transfer
  - Making and tracking replicas (copies) of files
- …plus Common Software Infrastructure
  - Libraries for writing Grid software applications

20 Running Programs on the Grid
[Diagram: GRAM architecture. A client uses MDS client API calls to locate resources (via the MDS Grid Index Info Server) and to get resource info (via the MDS Grid Resource Info Server), then makes GRAM client API calls to request resource allocation and process creation, receiving GRAM state-change callbacks. Inside the site boundary, the Gatekeeper authenticates through the Globus Security Infrastructure and starts a Job Manager, which uses the RSL Library to parse the request, allocates and creates processes through the Local Resource Manager, and monitors and controls them; the client can query the current status of the resource.]
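The flow in this diagram can be sketched as a toy program. None of the names below are the real Globus APIs; InfoService, Gatekeeper, and the job-request dictionary are hypothetical stand-ins that just model the sequence of steps (locate a resource, submit through the gatekeeper, receive state-change callbacks):

```python
# Hypothetical stand-ins for the components in the GRAM diagram.

class InfoService:
    """Models MDS: answers 'which resources can I use?' queries."""
    def __init__(self, resources):
        self.resources = resources

    def locate(self, min_cpus):
        return [r for r in self.resources if r["cpus"] >= min_cpus]

class Gatekeeper:
    """Models the gatekeeper + job manager: runs a job, reporting states."""
    def submit(self, request, callback):
        callback("PENDING")   # request parsed, resources being allocated
        callback("ACTIVE")    # local resource manager started the processes
        callback("DONE")      # job completed

mds = InfoService([{"host": "cluster1", "cpus": 64},
                   {"host": "ws9", "cpus": 1}])

states = []
chosen = mds.locate(min_cpus=32)[0]           # 1. locate + select a resource
Gatekeeper().submit(                          # 2. request allocation/creation
    {"host": chosen["host"], "executable": "/bin/physapp"},
    states.append)                            # 3. state-change callbacks
print(chosen["host"], states)
```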

21 The Grid Information Problem
- Large numbers of distributed "sensors" with different properties
- Need for different "views" of this information, depending on community membership, security constraints, intended purpose, sensor type

22 Grid Information Service
[Diagram]

23 GridFTP: Ubiquitous, Secure, High-Performance Data Access Protocol
- Common transfer protocol
  - All systems can exchange files with each other
- VERY fast
  - Sends files faster than 1 Gigabit per second
- Secure
  - Makes important data hard to damage or intercept
- Applications can tailor it to their needs
  - Building in security or "on the fly" processing
- Interfaces to many storage systems
  - Disk farms, tape robots
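Part of GridFTP's speed comes from moving a file over several data streams at once. This is a toy sketch of that one idea, in plain Python threads, not the real protocol or any Globus API: split the payload into chunks, carry each chunk on its own "stream", and reassemble by offset on the far side.

```python
import threading

payload = bytes(range(256)) * 64          # 16 KiB of data to "transfer"
n_streams = 4
received = {}                             # offset -> chunk, as chunks "arrive"

def stream(i):
    chunk = len(payload) // n_streams
    off = i * chunk
    received[off] = payload[off:off + chunk]   # each stream carries one slice

threads = [threading.Thread(target=stream, args=(i,)) for i in range(n_streams)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Chunks may finish in any order; sorting by offset restores the file.
rebuilt = b"".join(received[off] for off in sorted(received))
print(rebuilt == payload)  # True
```

In the real protocol each stream is a separate TCP connection, which is what lets one transfer fill a fat long-distance pipe that a single connection would underuse.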

24 Striped GridFTP Server
[Diagram: a GridFTP client holds a GridFTP control channel (control socket) to a GridFTP server master (mpirun), which coordinates a parallel backend of GridFTP servers over MPI (Comm_World and a sub-communicator); each backend reaches a parallel file system (e.g. PVFS, PFS, etc.) through MPI-IO and a plug-in control layer; GridFTP data channels run to the client or to another striped GridFTP server.]

25 Striped GridFTP Application: Video Server

26 Replica Catalog Structure
[Diagram: a Replica Catalog holds logical collections; each collection maps logical files to the physical locations that hold replicas.]
- Logical Collection: C02 measurements 1998
  - Logical Files: Jan 1998 (parent), Feb 1998 (size: 1468762), …
  - Location jupiter.isi.edu – Filenames: Jan 1998, Feb 1998, … Mar 1998, Jun 1998, Oct 1998; Protocol: GridFTP; UrlConstructor: gridftp://jupiter.isi.edu/nfs/v6/climate
  - Location sprite.llnl.gov – Filenames: Jan 1998, … Dec 1998; Protocol: ftp; UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi
- Logical Collection: C02 measurements 1999
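The catalog structure on this slide is essentially a nested mapping, and resolving a logical name to physical URLs is a lookup over it. A minimal sketch: the hosts and path prefixes come from the slide, but the dictionary layout, the file sets per location, and the resolve() helper are our own illustration, not the real replica catalog API.

```python
# Toy replica catalog: logical collection -> locations -> files held there.
catalog = {
    "C02 measurements 1998": {
        "locations": [
            {"host": "jupiter.isi.edu",
             "url_prefix": "gridftp://jupiter.isi.edu/nfs/v6/climate/",
             "files": {"Jan 1998", "Feb 1998", "Mar 1998"}},
            {"host": "sprite.llnl.gov",
             "url_prefix": "ftp://sprite.llnl.gov/pub/pcmdi/",
             "files": {"Jan 1998", "Feb 1998", "Dec 1998"}},
        ],
    },
}

def resolve(collection, logical_file):
    """Return every physical URL holding a replica of logical_file."""
    return [loc["url_prefix"] + logical_file
            for loc in catalog[collection]["locations"]
            if logical_file in loc["files"]]

# "Dec 1998" has one replica; "Jan 1998" has two, so a client can
# pick the closest or fastest copy.
print(resolve("C02 measurements 1998", "Dec 1998"))
print(len(resolve("C02 measurements 1998", "Jan 1998")))  # 2
```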

27 Programming with Globus
- UNIX based – Windows coming soon
  - Used by rest of Globus Toolkit
  - Users can use it for portability & convenience
  - Windows, UNIX, and Macintosh computers can all join the Grid
  - Portable programming very important
- Event-Driven Programming
  - A way of writing programs that handle many things at once
- Parallel Programs
  - Writing programs that can utilize many computers to solve a single problem
  - MPI – a popular Message Passing Interface developed at Argonne and other laboratories

28 Grids and Applications

29 Hunting for Gravity Waves

30 Grid Communities and Applications: Network for Earthquake Eng. Simulation
- NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
- On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC – www.neesgrid.org

31 The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Diagram: site resources and external networks at the four TeraGrid/DTF sites – NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, and Argonne – with HPSS and UniTree archival storage]
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne – www.teragrid.org

32 iVDGL Map Circa 2002-2003
[Map legend: Tier0/1 facility, Tier2 facility, Tier3 facility; 10+ Gbps, 2.5 Gbps, 622 Mbps, and other links]

33 What's it like to Work on the Grid?
- A fascinating problem on the frontiers of computer science
- Work with people from around the world and many branches of science
- Local labs and universities at the forefront
  - Argonne, Fermilab
  - Illinois (UIC and UIUC), U of Chicago, Northwestern
  - Wisconsin also very active!

34 Access Grid
- Collaborative work among large groups
- ~50 sites worldwide
- Use Grid services for discovery, security
- See also www.scglobal.org
[Diagram: Access Grid room with ambient mic (tabletop), presenter mic, presenter camera, and audience camera]
Access Grid: Argonne, others – www.mcs.anl.gov/FL/accessgrid

35 Come Visit and Explore
- Argonne and Fermilab are right in our own backyard!
  - Visits
  - Summer programs

36 Supplementary Material

37 Executor Example: Condor DAGMan
- Directed Acyclic Graph Manager
- Specify the dependencies between Condor jobs using a DAG data structure
- Manage dependencies automatically
  - (e.g., "Don't run job B until job A has completed successfully.")
- Each job is a "node" in the DAG
- Any number of parent or child nodes
- No loops
[DAG diagram: Job A at the top; Job B and Job C below it; Job D at the bottom]
Slide courtesy Miron Livny, U. Wisconsin
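The scheduling rule DAGMan enforces can be sketched in a few lines: a job runs only after all of its parents have completed. This is our own illustration of the idea, not DAGMan's implementation or API; the diamond-shaped DAG (A before B and C, D after both) matches the jobs on the slide.

```python
# Each job maps to the list of parent jobs that must finish first.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}

def run_order(parents):
    """Return one valid execution order respecting all dependencies."""
    done, order = set(), []
    while len(done) < len(parents):
        ready = [j for j in parents
                 if j not in done and all(p in done for p in parents[j])]
        if not ready:
            raise ValueError("cycle in DAG")   # DAGMan forbids loops too
        for job in sorted(ready):              # ready jobs could run in parallel
            order.append(job)
            done.add(job)
    return order

print(run_order(parents))  # ['A', 'B', 'C', 'D']
```

Note that B and C become ready in the same pass: with two machines they could run simultaneously, which is exactly the parallelism a DAG exposes.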

38 Executor Example: Condor DAGMan (cont.)
- DAGMan acts as a "meta-scheduler"
  - Holds & submits jobs to the Condor queue at the appropriate times based on DAG dependencies
- If a job fails, DAGMan continues until it can no longer make progress and then creates a "rescue" file with the current state of the DAG
  - When the failed job is ready to be re-run, the rescue file is used to restore the prior state of the DAG
[Diagram: DAGMan feeding jobs A, B, C, D into the Condor job queue]
Slide courtesy Miron Livny, U. Wisconsin

39 Virtual Data in CMS
Virtual Data Long Term Vision of CMS: CMS Note 2001/047, GriPhyN 2001-16

40 CMS Data Analysis
[Data-flow diagram: uploaded data (raw data, simulated or real; calibration data) and algorithms (reconstruction algorithm, jet finders 1 and 2, tags 1 and 2) produce virtual data – reconstructed data produced by physics analysis jobs – for events 1, 2, and 3. Object sizes range from 100-200 bytes (tags) through 5K-7K and 50K-100K up to 300K (reconstructed data).]
Dominant use of Virtual Data in the Future

41 GriPhyN-CMS Demo: Production Pipeline (SC2001 Demo Version)
[Pipeline diagram – 1 run = 500 events:
  pythia → truth.ntpl (0.5 MB; CPU: 2 min per run)
  cmsim → hits.fz (175 MB; CPU: 8 hours per run)
  writeHits → hits.DB (275 MB; CPU: 5 min)
  writeDigis → digis.DB (105 MB; CPU: 45 min)]

42 GriPhyN: Virtual Data – Tracking Complex Dependencies
- Dependency graph is:
  - Files: 8 < (1,3,4,5,7), 7 < 6, (3,4,5,6) < 2
  - Programs: 8 < psearch, 7 < summarize, (3,4,5) < reformat, 6 < conv, (1,2) < simulate
[Dependency graph: simulate –t 10 … produces file1 and file2; reformat –f fz … produces file3, file4, file5; conv –I esd –o aod produces file6; summarize –t 10 … produces file7; psearch –t 10 … produces file8, the requested file]

43 Re-creating Virtual Data
- To re-create file 8: Step 1
  - simulate > file1, file2
[Dependency graph as on slide 42]

44 Re-creating Virtual Data
- To re-create file 8: Step 2
  - Files 3, 4, 5, 6 derived from file 2
  - reformat > file3, file4, file5
  - conv > file6
[Dependency graph as on slide 42]

45 Re-creating Virtual Data
- To re-create file 8: Step 3
  - File 7 depends on file 6
  - summarize > file7
[Dependency graph as on slide 42]

46 Re-creating Virtual Data
- To re-create file 8: final step
  - File 8 depends on files 1, 3, 4, 5, 7
  - psearch > file8
[Dependency graph as on slide 42]
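The four re-creation steps above are just a depth-first walk of the dependency graph: to materialize a file, first materialize everything it is derived from, then run its producing program. A sketch, with the file-to-program mapping taken from the dependency lists stated on slide 42 (the recreate() helper is our own illustration):

```python
# file -> (producing program, input files), per the slide-42 dependency lists.
produced_by = {
    "file1": ("simulate", []),  "file2": ("simulate", []),
    "file3": ("reformat", ["file2"]), "file4": ("reformat", ["file2"]),
    "file5": ("reformat", ["file2"]), "file6": ("conv", ["file2"]),
    "file7": ("summarize", ["file6"]),
    "file8": ("psearch", ["file1", "file3", "file4", "file5", "file7"]),
}

def recreate(target, have, log):
    """Re-derive target, recursively re-deriving missing inputs first."""
    prog, inputs = produced_by[target]
    for f in inputs:
        if f not in have:
            recreate(f, have, log)
    if target not in have:
        log.append(prog)   # "run" the program once...
        # ...which materializes all of its outputs at the same time.
        have.update(f for f, (p, _) in produced_by.items() if p == prog)

log = []
recreate("file8", have=set(), log=log)
print(log)  # ['simulate', 'reformat', 'conv', 'summarize', 'psearch']
```

The resulting program order matches steps 1 through 4 on the slides: simulate, then reformat and conv, then summarize, then psearch.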

47 Virtual Data Catalog Conceptual Data Structure
[Diagram: linked catalog records]
TRANSFORMATION: /bin/physapp1, version 1.2.3b(2), created on 12 Oct 1998, owned by physbld.orca
DERIVATION: ^paramlist, ^transformation
PARAMETER LIST: PARAMETER i filename1; PARAMETER O filename2; PARAMETER E PTYPE=muon; PARAMETER p -g
FILE: LFN=filename1; PFN1=/store1/1234987; PFN2=/store9/2437218; PFN3=/store4/8373636; ^derivation
FILE: LFN=filename2; PFN1=/store1/1234987; PFN2=/store9/2437218; ^derivation

48 CMS Pipeline in VDL
[Pipeline stages: pythia_input → pythia.exe → cmsim_input → cmsim.exe → writeHits → writeDigis]

begin v /usr/local/demo/scripts/cmkin_input.csh
  file i ntpl_file_path
  file i template_file
  file i num_events
  stdout cmkin_param_file
end

begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
  pre cms_env_var
  stdin cmkin_param_file
  stdout cmkin_log
  file o ntpl_file
end

begin v /usr/local/demo/scripts/cmsim_input.csh
  file i ntpl_file
  file i fz_file_path
  file i hbook_file_path
  file i num_trigs
  stdout cmsim_param_file
end

begin v /usr/local/demo/binaries/cms121.exe
  condor copy_to_spool=false
  condor getenv=true
  stdin cmsim_param_file
  stdout cmsim_log
  file o fz_file
  file o hbook_file
end

begin v /usr/local/demo/binaries/writeHits.sh
  condor getenv=true
  pre orca_hits
  file i fz_file
  file i detinput
  file i condor_writeHits_log
  file i oo_fd_boot
  file i datasetname
  stdout writeHits_log
  file o hits_db
end

begin v /usr/local/demo/binaries/writeDigis.sh
  pre orca_digis
  file i hits_db
  file i oo_fd_boot
  file i carf_input_dataset_name
  file i carf_output_dataset_name
  file i carf_input_owner
  file i carf_output_owner
  file i condor_writeDigis_log
  stdout writeDigis_log
  file o digis_db
end

49 Virtual Data for Real Science: A Prototype Virtual Data Catalog
[Architecture diagram: a Virtual Data Catalog (PostgreSQL) with local file storage is queried in the Virtual Data Language by a VDL Interpreter (VDLI). Job submission sites (ANL, SC, …) run a Condor-G agent, Globus client, and GridFTP server; job execution sites at U of Chicago, U of Wisconsin, and U of Florida each run a GridFTP client, Globus GRAM, and a Condor pool, joined by GSI on a Grid testbed.]
[Production DAG of simulated CMS data: Simulate Physics → Simulate CMS Detector Response → Copy flat-file to OODBMS → Simulate Digitization of Electronic Signals]

50 Early GriPhyN Challenge Problem: CMS Data Reconstruction
[Diagram, steps in order:
1) Master Condor job running at Caltech (Caltech workstation)
2) Launch secondary job on Wisconsin pool; input files via Globus GASS
3) 100 Monte Carlo jobs on Wisconsin Condor pool
4) 100 data files transferred via GridFTP, ~1 GB each
5) Secondary reports complete to master
6) Master starts reconstruction jobs via Globus jobmanager on NCSA Linux cluster
7) GridFTP fetches data from NCSA UniTree (GridFTP-enabled FTP server)
8) Processed Objectivity database stored to UniTree
9) Reconstruction job reports complete to master]
Scott Koranda, Miron Livny, others

51 GriPhyN-LIGO SC2001 Demo

52 GriPhyN CMS SC2001 Demo
"Bandwidth Greedy" Grid-enabled Object Collection Analysis for Particle Physics
[Diagram: a request against a "tag" database of ~140,000 small objects drives parallel tuned GSI FTP transfers from full event databases of ~100,000 and ~40,000 large objects]
http://pcbunn.cacr.caltech.edu/Tier2/Tier2_Overall_JJB.htm

53 iVDGL
- International Virtual-Data Grid Laboratory
  - A place to conduct Data Grid tests at scale
  - Concrete manifestation of world-wide grid activity
  - Continuing activity that will drive Grid awareness
- Scale of effort
  - For national, international-scale Data Grid tests, operations
  - Computation & data intensive computing
- Who
  - Initially US-UK-Italy-EU; Japan, Australia
  - & Russia, China, Pakistan, India, South America?
  - StarLight and other international networks vital
U.S. Co-PIs: Avery, Foster, Gardner, Newman, Szalay

54 iVDGL Map Circa 2002-2003
[Map legend: Tier0/1 facility, Tier2 facility, Tier3 facility; 10+ Gbps, 2.5 Gbps, 622 Mbps, and other links]

55 Summary
- "Grids": resource sharing & problem solving in dynamic virtual organizations
  - Many projects now working to develop, deploy, apply relevant technologies
- Common protocols and services are critical
  - Globus Toolkit a source of protocol and API definitions, reference implementations
- Rapid progress on definition, implementation, and application of Data Grid architecture
  - Harmonizing U.S. and E.U. efforts important

