Grid Physics Network & International Virtual Data Grid Lab
Ian Foster*
For the GriPhyN & iVDGL Projects
SCI PI Meeting, February 18-20, 2004
*Argonne, U.Chicago, Globus

2 Cyberinfrastructure “A new age has dawned in scientific & engineering research, pushed by continuing progress in computing, information, and communication technology, & pulled by the expanding complexity, scope, and scale of today’s challenges. The capacity of this technology has crossed thresholds that now make possible a comprehensive “cyberinfrastructure” on which to build new types of scientific & engineering knowledge environments & organizations and to pursue research in new ways & with increased efficacy.” [Blue Ribbon Panel report, 2003] But how will we learn how to build, operate, & use it?

3 Our Approach: Experimental & Collaborative Experimental procedure: Mix together, and shake well:  Physicists* with an overwhelming need to pool resources to solve fundamental science problems  Computer scientists with a vision of a Grid that will enable virtual communities to share resources Monitor byproducts  Heat: sometimes incendiary  Light: considerable, in eScience, computer science, & cyberinfrastructure engineering  Operational cyberinfrastructure (hardware & software), with an enthusiastic and knowledgeable user community, and real scientific benefits * We use “physicist” as a generic term indicating a non-computer scientist

4 Who are the “Physicists”? – GriPhyN/iVDGL Science Drivers US-ATLAS, US-CMS (LHC expts)  Fundamental nature of matter  100s of Petabytes LIGO observatory  Gravitational wave search  100s of Terabytes Sloan Digital Sky Survey  Astronomical research  10s of Terabytes [chart: experiments plotted by data growth vs. community growth] A growing number of biologists & other scientists + computer scientists needing experimental apparatus

5 Common Underlying Problem: Data-Intensive Analysis Users & resources in many institutions …  1000s of users, 100s of institutions, petascale resources … engage in collaborative data analysis  Both structured/scheduled & interactive Many overlapping virtual orgs must  Define activities  Pool resources  Prioritize tasks  Manage data  …

6 Vision & Goals Develop the technologies & tools needed to exploit a distributed cyberinfrastructure Apply and evaluate those technologies & tools in challenging scientific problems Develop the technologies & procedures to support a persistent cyberinfrastructure Create and operate a persistent cyberinfrastructure in support of diverse discipline goals [spanning the system end-to-end] GriPhyN + iVDGL + DOE Particle Physics Data Grid (PPDG) = Trillium

7 Two Distinct but Integrated Projects Both NSF-funded, overlapping periods  GriPhyN: $11.9M (NSF) + $1.6M (match) (2000–2005), CISE  iVDGL: $13.7M (NSF) + $2M (match) (2001–2006), MPS Basic composition  GriPhyN: 12 universities, SDSC, 3 labs (~80 people)  iVDGL: 18 institutions, SDSC, 4 labs (~100 people)  Large overlap in people, institutions, experiments, software GriPhyN (Grid research) vs iVDGL (Grid deployment)  GriPhyN: 2/3 “CS” + 1/3 “physics” (0% H/W)  iVDGL: 1/3 “CS” + 2/3 “physics” (20% H/W) Many common elements:  Directors, Advisory Committee, linked management  Virtual Data Toolkit (VDT), Grid testbeds, Outreach effort Build on the Globus Toolkit, Condor, and other technologies

8 Project Specifics: GriPhyN Develop the technologies & tools needed to exploit a distributed cyberinfrastructure Apply and evaluate those technologies & tools in challenging scientific problems Develop the technologies & procedures to support a persistent cyberinfrastructure Create and operate a persistent cyberinfrastructure in support of diverse discipline goals

GriPhyN Overview [architecture diagram: researchers, production managers, and science review drive production and analysis applications; virtual data services handle composition, planning, and execution; the Virtual Data Toolkit layer comprises the Chimera virtual data system, the Pegasus planner, DAGMan, the Globus Toolkit, Condor, Ganglia, etc.; underneath sits the Grid fabric of storage elements, instruments, data, and discovery/sharing services]

(Early) Virtual Data Language: CMS “Pipeline”
[pipeline stages: pythia_input -> pythia.exe -> cmsim_input -> cmsim.exe -> writeHits -> writeDigis]
    begin v /usr/local/demo/scripts/cmkin_input.csh
      file i ntpl_file_path
      file i template_file
      file i num_events
      stdout cmkin_param_file
    end
    begin v /usr/local/demo/binaries/kine_make_ntpl_pyt_cms121.exe
      pre cms_env_var
      stdin cmkin_param_file
      stdout cmkin_log
      file o ntpl_file
    end
    begin v /usr/local/demo/scripts/cmsim_input.csh
      file i ntpl_file
      file i fz_file_path
      file i hbook_file_path
      file i num_trigs
      stdout cmsim_param_file
    end
    begin v /usr/local/demo/binaries/cms121.exe
      condor copy_to_spool=false
      condor getenv=true
      stdin cmsim_param_file
      stdout cmsim_log
      file o fz_file
      file o hbook_file
    end
    begin v /usr/local/demo/binaries/writeHits.sh
      condor getenv=true
      pre orca_hits
      file i fz_file
      file i detinput
      file i condor_writeHits_log
      file i oo_fd_boot
      file i datasetname
      stdout writeHits_log
      file o hits_db
    end
    begin v /usr/local/demo/binaries/writeDigis.sh
      pre orca_digis
      file i hits_db
      file i oo_fd_boot
      file i carf_input_dataset_name
      file i carf_output_dataset_name
      file i carf_input_owner
      file i carf_output_owner
      file i condor_writeDigis_log
      stdout writeDigis_log
      file o digis_db
    end
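To make the execution model concrete: an abstract pipeline like the one above is compiled into a DAG and handed to Condor DAGMan (both named on the GriPhyN overview slide). The sketch below is illustrative only; the node and submit-file names are hypothetical, not taken from the original slide.

    # Hypothetical DAGMan description of the six-stage CMS pipeline above.
    # Each JOB line names a node and its Condor submit file; PARENT/CHILD
    # lines encode the data dependencies between stages.
    JOB pythia_input   pythia_input.sub
    JOB pythia         pythia.sub
    JOB cmsim_input    cmsim_input.sub
    JOB cmsim          cmsim.sub
    JOB writeHits      writeHits.sub
    JOB writeDigis     writeDigis.sub
    PARENT pythia_input CHILD pythia
    PARENT pythia       CHILD cmsim_input
    PARENT cmsim_input  CHILD cmsim
    PARENT cmsim        CHILD writeHits
    PARENT writeHits    CHILD writeDigis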

Virtual Data Example: High Energy Physics A scientist discovers an interesting result and wants to know how it was derived. The search is for WW decays of the Higgs boson, where only stable, final-state particles are recorded (stability = 1). The scientist adds a new derived data branch and continues to investigate. [derivation-tree diagram: datasets parameterized by mass = 200; decay = WW, ZZ, or bb; stability = 1 or 3; and selections such as LowPt = 20, event = 8, plot = 1] Work and slide by Rick Cavanaugh and Dimitri Bourilkov, University of Florida
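The virtual-data idea behind this example can be sketched in a few lines of ordinary Python. This is a conceptual illustration only, not the Chimera API or VDL syntax: each result is recorded together with the transformation and parameters that produced it, so its derivation can be queried and new branches derived by changing a parameter.

    # Conceptual sketch of virtual data: record how each dataset is derived,
    # so provenance can be queried and variants re-derived on demand.
    # (Illustration only; not the Chimera virtual data system's interface.)
    catalog = {}

    def derive(transform, **params):
        """Return a key for the requested derivation, materializing it if new."""
        key = (transform, tuple(sorted(params.items())))
        if key not in catalog:
            catalog[key] = {"transform": transform, "params": params}
            print("running", transform, "with", params)
        return key

    # Original analysis: WW decays with mass = 200, stable particles only (stability = 1).
    result = derive("higgs_search", mass=200, decay="WW", stability=1)

    # "How was this result derived?"
    print("provenance:", catalog[result])

    # Add a new derived-data branch by changing one parameter and continue.
    derive("higgs_search", mass=200, decay="ZZ", stability=1)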

12 Virtual Data Example: Sloan Galaxy Cluster Analysis [figure: Sloan data, analysis task graph, galaxy cluster size distribution] Jim Annis, Steve Kent, Vijay Sehkri, Neha Sharma (Fermilab); Michael Milligan, Yong Zhao (Chicago)

13 Virtual Data Example: NVO/NASA Montage A small (1200-node) workflow that constructs custom mosaics on demand from multiple data sources; the user specifies projection, coordinates, size, rotation, and spatial sampling. Work by Ewa Deelman et al., USC/ISI and Caltech

14 Virtual Data Example: Education (Work in Progress) “We uploaded the data to the Grid & used the grid analysis tools to find the shower”

15 Project Specifics: iVDGL Develop the technologies & tools needed to exploit a distributed cyberinfrastructure Apply and evaluate those technologies & tools in challenging scientific problems Develop the technologies & procedures to support a persistent cyberinfrastructure Create and operate a persistent cyberinfrastructure in support of diverse discipline goals

16 iVDGL Goals Deploy a Grid laboratory  Support research mission of data-intensive expts  Computing & personnel resources at university sites  Provide platform for computer science development  Prototype and deploy a Grid Operations Center Integrate Grid software tools  Into computing infrastructures of the experiments Support delivery of Grid technologies  Harden VDT & other middleware technologies developed by GriPhyN and other Grid projects Education and Outreach  Enable underrepresented groups & remote regions to participate in international science projects

17 Virtual Data Toolkit [build & test process diagram: sources from CVS and contributors (VDS, etc.) are patched into GPT source bundles, then built, tested, and packaged by NMI Build & Test on a Condor pool of 37 computers, producing RPMs, binaries, and a Pacman cache; will use NMI processes soon] A unique laboratory for managing, testing, supporting, deploying, packaging, upgrading, & troubleshooting complex sets of software!

18 Virtual Data Toolkit: Tools in VDT
Condor Group:  Condor/Condor-G  DAGMan  Fault Tolerant Shell  ClassAds
Globus Alliance:  Grid Security Infrastructure (GSI)  Job submission (GRAM)  Information service (MDS)  Data transfer (GridFTP)  Replica Location Service (RLS)
EDG & LCG:  Make Gridmap  Certificate Revocation List Updater  Glue Schema/Info provider
ISI & UC:  Chimera & related tools  Pegasus
NCSA:  MyProxy  GSI-OpenSSH
LBL:  pyGlobus  NetLogger
Caltech:  MonALISA
VDT:  VDT System Profiler  Configuration software
Others:  KX509 (U. Mich.)
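Because Condor-G and GRAM ship together in the VDT, it may help to show how they are typically combined. Below is a minimal Condor-G submit description of that era, offered as a sketch; the gatekeeper host and executable are placeholders, not drawn from the talk.

    # Minimal Condor-G submit file (sketch; host and executable are placeholders).
    # Condor-G forwards the job to a remote Globus GRAM gatekeeper.
    universe        = globus
    globusscheduler = gatekeeper.example.edu/jobmanager-condor
    executable      = analyze_events
    arguments       = --nevents 1000
    output          = analyze.out
    error           = analyze.err
    log             = analyze.log
    queue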

19 VDT Growth [timeline chart of VDT releases: VDT 1.0 (Globus 2.0b, Condor); VDT 1.1.3 & pre-SC 2002; switch to Globus 2.2; Grid2003; first real use by LCG]

20 Grid2003: An Operational Grid  28 sites (2800 CPUs) & growing  ~1300 concurrent jobs at peak  7 substantial applications + CS experiments  Running since October 2003 [site map spans the US and Korea]

21 Grid2003 Components Computers & storage at 28 sites (to date)  2800 CPUs Uniform service environment at each site  Globus Toolkit provides basic authentication, execution management, data movement  Pacman installation system enables installation of numerous other VDT and application services Global & virtual organization services  Certification & registration authorities, VO membership services, monitoring services Client-side tools for data access & analysis  Virtual data, execution planning, DAG management, execution management, monitoring IGOC: iVDGL Grid Operations Center
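As an illustration of the uniform service environment, the basic Globus Toolkit clients of that generation could be used roughly as follows; the host names below are placeholders, not actual Grid2003 sites.

    # Create a short-lived proxy credential from the user's grid certificate (GSI).
    grid-proxy-init

    # Submit a simple job to a remote GRAM gatekeeper.
    globus-job-run gatekeeper.example.edu/jobmanager-fork /bin/hostname

    # Move a file with GridFTP (here, from a remote storage element to local disk).
    globus-url-copy gsiftp://se.example.edu/data/sample.dat file:///tmp/sample.dat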

22 Grid2003 Metrics

Metric                                     Target      Achieved
Number of CPUs (28 sites)                              2800
Number of users                                        (16)
Number of applications                     > 4         10 (+CS)
Number of sites running concurrent apps    > 10        17
Peak number of concurrent jobs                         ~1300
Data transfer per day                      > 2-3 TB    4.4 TB max

23 Grid2003 Applications To Date CMS proton-proton collision simulation ATLAS proton-proton collision simulation LIGO gravitational wave search SDSS galaxy cluster detection ATLAS interactive analysis BTeV proton-antiproton collision simulation SnB biomolecular analysis GADU/GNARE genome analysis Various computer science experiments

24 Grid2003 Usage

25 Grid2003 Scientific Impact: E.g., U.S. CMS 2003 Production 10M events produced: largest ever contribution Almost double the number of events during first 25 days vs. 2002, with half the manpower  Production run with 1 person working 50%  400 jobs at once vs. 200 previous year  Multi-VO sharing Continuing at an accelerating rate into 2004 Many issues remain: e.g., scaling, missing functionality

26 Grid2003 as CS Research Lab: E.g., Adaptive Scheduling Adaptive data placement in a realistic environment (K. Ranganathan) Enables comparisons with simulations
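To give a flavor of what "adaptive data placement" means in practice, here is a toy Python sketch of a policy that reacts to observed access patterns. It is an illustration of the general idea only, not K. Ranganathan's algorithm or any code used on Grid2003; dataset and site names are hypothetical.

    # Toy adaptive data-placement policy: replicate a dataset to a site
    # once that site has requested it often enough. (Illustration only.)
    from collections import Counter

    access_counts = Counter()          # (site, dataset) -> number of accesses
    replicas = {"ds42": {"site_A"}}    # dataset -> sites holding a replica

    def record_access(site, dataset, threshold=3):
        """Count accesses; replicate a dataset to a site that uses it often."""
        access_counts[(site, dataset)] += 1
        if access_counts[(site, dataset)] >= threshold and site not in replicas[dataset]:
            replicas[dataset].add(site)          # simulate creating a replica
            print("replicating", dataset, "to", site)

    for _ in range(3):
        record_access("site_B", "ds42")          # third access triggers a replica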

27 Grid2003 Lessons Learned How to operate a Grid  Add sites, recover from errors, provide information, update software, test applications, …  Tools, services, procedures, docs, organization  Need reliable, intelligent, skilled people How to scale algorithms, software, process  “Interesting” failure modes as scale increases  Increasing scale must not overwhelm human resources How to delegate responsibilities  At Project, Virtual Org., service, site, appln level  Distribution of responsibilities for future growth How to apply distributed cyberinfrastructure

28 Summary: We Are Building Cyberinfrastructure … GriPhyN/iVDGL (+ DOE PPDG & LHC, etc.) are  Creating an (inter)national-scale, multi-disciplinary infrastructure for distributed data-intensive science;  Demonstrating the utility of such infrastructure via a broad set of applications (not just physics!);  Learning many things about how such infrastructures should be created, operated, and evolved; and  Capturing best practices in software & procedures, including VDT, Pacman, monitoring tools, etc. Unique scale & application breadth  Grid3: 10 apps (science & CS), 28 sites, 2800 CPUs, 1300 jobs, and growing rapidly CS-applications-operations partnership  Having a strong impact on all three

29 … And Are Open for Business Virtual Data Toolkit  Distributed workflow & data management & analysis  Data replication, data provenance, etc.  Virtual organization management  Globus Toolkit, Condor, and other good stuff Grid2003  Adapt your applications to use VDT mechanisms and obtain order-of-magnitude increases in performance  Add your site to Grid2003 & join a national-scale cyberinfrastructure  Propose computer science experiments in a unique environment  Write an NMI proposal to fund this work

30 For More Information: the GriPhyN, iVDGL, PPDG, Grid2003, and Virtual Data Toolkit project web sites; [book cover: 2nd Edition]