Slide 1: The GriPhyN Project (Grid Physics Network)
Paul Avery, University of Florida
Internet 2 Workshop, Atlanta, Nov. 1, 2000

Slide 2: Motivation: Data Intensive Science
- Scientific discovery increasingly driven by IT
  - Computationally intensive analyses
  - Massive data collections
  - Rapid access to large subsets
  - Data distributed across networks of varying capability
- Dominant factor: data growth
  - ~0.5 Petabyte in 2000 (BaBar)
  - ~10 Petabytes by 2005
  - ~100 Petabytes by 2010
  - ~1000 Petabytes by 2015?
- Robust IT infrastructure essential for science
  - Provide rapid turnaround
  - Coordinate, manage the limited computing, data handling and network resources effectively
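Taken at face value, these milestones imply a sustained growth rate of roughly

    \left(\frac{1000\ \mathrm{PB}}{0.5\ \mathrm{PB}}\right)^{1/15} \approx 1.66\ \text{per year},
    \qquad t_{2\times} = \frac{\ln 2}{\ln 1.66} \approx 1.4\ \text{years},

i.e. the data volume roughly doubles every year and a half over the 15-year span.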

Slide 3: Grids as IT Infrastructure
- Grid: geographically distributed IT resources configured to allow coordinated use
- Physical resources & networks provide the raw capability
- "Middleware" services tie it together

Slide 4: Data Grid Hierarchy (LHC Example)
- Tier 0: CERN
- Tier 1: National Lab
- Tier 2: Regional Center at University
- Tier 3: University workgroup
- Tier 4: Workstation
GriPhyN:
- R&D
- Tier 2 centers
- Unify all IT resources

Slide 5: Why a Data Grid: Physical
- Unified system: all computing resources part of the grid
  - Efficient resource use (manage scarcity)
  - Resource discovery / scheduling / coordination truly possible
  - "The whole is greater than the sum of its parts"
- Optimal data distribution and proximity
  - Labs are close to the (raw) data they need
  - Users are close to the (subset) data they need
  - Minimize bottlenecks
- Efficient network use
  - local > regional > national > oceanic
  - No choke points
- Scalable growth
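The proximity argument can be made concrete with a toy lookup that serves each request from the nearest tier holding a copy of the data. This is an illustrative Python sketch of the hierarchy on slide 4, not GriPhyN software; the tier names and the holdings structure are assumptions.

    # Walk up the hierarchy: local > regional > national > CERN.
    PARENT = {
        "Tier4 workstation": "Tier3 university workgroup",
        "Tier3 university workgroup": "Tier2 regional center",
        "Tier2 regional center": "Tier1 national lab",
        "Tier1 national lab": "Tier0 CERN",
    }

    def nearest_copy(dataset, start_tier, holdings):
        """Return the closest tier, starting from start_tier and moving upward,
        whose holdings (a dict tier -> set of dataset names) include the data."""
        tier = start_tier
        while tier is not None:
            if dataset in holdings.get(tier, set()):
                return tier
            tier = PARENT.get(tier)    # not here: try the next tier up
        return None                    # not even Tier 0 has it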

Slide 6: Why a Data Grid: Demographic
- Central lab cannot manage / help 1000s of users
  - Easier to leverage resources, maintain control, assert priorities at the regional / local level
- Cleanly separates functionality
  - Different resource types in different tiers
  - Organization vs. flexibility
  - Funding complementarity (NSF vs. DOE), targeted initiatives
- New IT resources can be added "naturally"
  - Matching resources at Tier 2 universities
  - Larger institutes can join, bringing their own resources
  - Tap into new resources opened by the IT "revolution"
- Broaden the community of scientists and students
  - Training and education
  - Vitality of the field depends on the university / lab partnership

Slide 7: GriPhyN = Applications + CS + Grids
- Several scientific disciplines
  - US-CMS: High Energy Physics
  - US-ATLAS: High Energy Physics
  - LIGO/LSC: Gravity wave research
  - SDSS: Sloan Digital Sky Survey
- Strong partnership with computer scientists
- Design and implement production-scale grids
  - Maximize effectiveness of large, disparate resources
  - Develop common infrastructure, tools and services
  - Build on foundations: Globus, PPDG, MONARC, Condor, ...
  - Integrate and extend existing facilities
- $70M total cost (NSF?)
  - $12M: R&D
  - $39M: Tier 2 center hardware, personnel, operations
  - $19M(?): Networking

Slide 8: Particle Physics Data Grid (PPDG)
- First-round goal: optimized cached read access to Gbytes drawn from a total data set of 0.1 to ~1 Petabyte
- Matchmaking, co-scheduling: SRB, Condor, Globus services; HRM, NWS
- Participants: ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, U. Wisc/CS
[Diagram: a Site-to-Site Data Replication Service (100 Mbytes/sec) links a primary site (data acquisition, CPU, disk, tape robot) to a secondary site (CPU, disk, tape robot); a Multi-Site Cached File Access Service links a primary site (DAQ, tape, CPU, disk, robot), satellite sites (tape, CPU, disk, robot), and universities (CPU, disk, users)]
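The "cached read access" idea can be sketched as follows. This is an illustrative Python sketch, not PPDG's actual SRB/HRM interfaces; the cache layout and the fetch callable are assumptions.

    def cached_read(filename, local_cache, fetch_from_primary):
        """Serve a read from the local disk cache when possible.
        local_cache: dict mapping file name -> locally staged bytes.
        fetch_from_primary: callable doing the wide-area transfer from the
        primary site's mass storage (~100 Mbytes/sec site to site)."""
        if filename not in local_cache:                        # cache miss
            local_cache[filename] = fetch_from_primary(filename)
        return local_cache[filename]                           # later reads stay local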

Slide 9: Grid Data Management Prototype (GDMP)
Distributed job execution and data handling goals:
- Transparency
- Performance
- Security
- Fault tolerance
- Automation
[Diagram: a job submitted at Site A runs at Site A, B, or C; the job writes its data locally, and the data is then replicated between sites]
- Jobs are executed locally or remotely
- Data is always written locally
- Data is replicated to remote sites
GDMP V1.0: Caltech + EU DataGrid WP2; for CMS "HLT" production in October
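A minimal sketch of the write-locally-then-replicate pattern above, in Python. The site objects and their run/replicate methods are hypothetical names for illustration, not the GDMP V1.0 API.

    def run_and_replicate(job, exec_site, all_sites):
        """Execute the job at the site chosen by the scheduler, write its
        output there, then push replicas of the new files to the other sites."""
        output_files = exec_site.run(job)              # data is always written locally
        for other in all_sites:
            if other is not exec_site:
                for f in output_files:
                    exec_site.replicate(f, other)      # site-to-site replication
        return output_files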

Slide 10: GriPhyN R&D Funded!
- NSF/ITR results announced Sep. 13
  - $11.9M from the Information Technology Research program
  - $1.4M in matching funds from universities
  - Largest of all ITR awards
  - Excellent reviews emphasizing the importance of the work
  - Joint NSF oversight from CISE and MPS
- Scope of ITR funding
  - Major costs are for people, especially students and postdocs
  - No hardware or professional staff for operations!
  - 2/3 CS + 1/3 application science
  - Industry partnerships being developed: Microsoft, Intel, IBM, Sun, HP, SGI, Compaq, Cisco
- GriPhyN still requires funding for implementation and operation of the Tier 2 centers

Slide 11: GriPhyN Institutions
- U Florida
- U Chicago
- Caltech
- U Wisconsin, Madison
- USC/ISI
- Harvard
- Indiana
- Johns Hopkins
- Northwestern
- Stanford
- Boston U
- U Illinois at Chicago
- U Penn
- U Texas, Brownsville
- U Wisconsin, Milwaukee
- UC Berkeley
- UC San Diego
- San Diego Supercomputer Center
- Lawrence Berkeley Lab
- Argonne
- Fermilab
- Brookhaven

Slide 12: Fundamental IT Challenge
"Scientific communities of thousands, distributed globally, and served by networks with bandwidths varying by orders of magnitude, need to extract small signals from enormous backgrounds via computationally demanding (Teraflops-Petaflops) analysis of datasets that will grow by at least 3 orders of magnitude over the next decade: from the 100 Terabyte to the 100 Petabyte scale."

Slide 13: GriPhyN Research Agenda
- Virtual Data technologies
  - Derived data, calculable via algorithm (e.g., 90% of HEP data)
  - Instantiated 0, 1, or many times
  - Fetch vs. execute algorithm
  - Very complex (versions, consistency, cost calculation, etc.)
- Planning and scheduling
  - User requirements (time vs. cost)
  - Global and local policies + resource availability
  - Complexity of scheduling in a dynamic environment (hierarchy)
  - Optimization and ordering of multiple scenarios
  - Requires simulation tools, e.g. MONARC
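The "fetch vs. execute" choice for virtual data can be sketched in Python as below. The catalog and planner interfaces and their cost estimates are illustrative assumptions, not the Virtual Data Toolkit API.

    def materialize(dataset, catalog, planner):
        """Return the requested derived dataset, either by fetching an existing
        instantiation or by re-running the algorithm that derives it."""
        replicas = catalog.replicas(dataset)           # existing copies (may be empty)
        recipe = catalog.derivation(dataset)           # algorithm + inputs that produce it
        fetch_cost = min((planner.transfer_cost(r) for r in replicas),
                         default=float("inf"))         # no copy anywhere: must execute
        if fetch_cost <= planner.compute_cost(recipe):
            best = min(replicas, key=planner.transfer_cost)
            return planner.fetch(best)                 # fetch the cheapest existing copy
        return planner.execute(recipe)                 # (re)derive the data on demand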

Slide 14: Virtual Data in Action
- A data request may:
  - Compute locally
  - Compute remotely
  - Access local data
  - Access remote data
- Scheduling based on:
  - Local policies
  - Global policies
  - Local autonomy

Slide 15: Research Agenda (cont.)
- Execution management
  - Co-allocation of resources (CPU, storage, network transfers)
  - Fault tolerance, error reporting
  - Agents (co-allocation, execution)
  - Reliable event service across the Grid
  - Interaction, feedback to planning
- Performance analysis (new)
  - Instrumentation and measurement of all grid components
  - Understand and optimize grid performance
- Virtual Data Toolkit (VDT)
  - VDT = virtual data services + virtual data tools
  - One of the primary deliverables of the R&D effort
  - Ongoing activity + feedback from experiments (5-year plan)
  - Technology transfer mechanism to other scientific domains

Slide 16: GriPhyN: PetaScale Virtual Data Grids
[Architecture diagram: production teams, individual investigators, and workgroups use interactive user tools built on virtual data tools, request planning & scheduling tools, and request execution & management tools; these rely on resource management services, security and policy services, and other Grid services, which manage transforms, raw data sources, and distributed resources (code, storage, computers, and networks)]

Slide 17: LHC Vision (e.g., CMS Hierarchy)
[Diagram: Online System → Offline Farm / CERN Computer Center (> 20 TIPS, Tier 0+1) → Tier 1 centers (France, FNAL, Italy, UK) → Tier 2 centers → Tier 3 institutes (~0.25 TIPS) → Tier 4 workstations and other portals; link speeds include ~100 MBytes/sec out of the online system, ~2.4 Gbits/sec and ~622 Mbits/sec WAN links (plus air freight for bulk data), and ~PBytes/sec aggregate rates into the physics data cache]
- One bunch crossing every 25 nsec; 100 triggers per second; each event is ~1 MByte in size
- Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels
- GriPhyN: focus on university-based Tier 2 centers
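The quoted trigger rate and event size fix the raw data rate out of the online system:

    100\ \tfrac{\text{events}}{\text{s}} \times 1\ \tfrac{\text{MByte}}{\text{event}}
    = 100\ \tfrac{\text{MBytes}}{\text{s}} \approx 0.8\ \tfrac{\text{Gbits}}{\text{s}},

consistent with the ~100 MBytes/sec link shown in the diagram.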

Slide 18: SDSS Vision
Three main functions:
- Main data processing (FNAL)
  - Processing of raw data on a grid
  - Rapid turnaround with multiple-TB data
  - Accessible storage of all imaging data
- Fast science analysis environment (JHU)
  - Combined data access and analysis of calibrated data
  - Shared by the whole collaboration
  - Distributed I/O layer and processing layer
  - Connected via redundant network paths
- Public data access
  - Provide the SDSS data for the NVO (National Virtual Observatory)
  - Complex query engine for the public
  - SDSS data browsing for astronomers, and outreach

Slide 19: LIGO Vision
Principal areas of GriPhyN applicability:
- Main data processing (Caltech/CACR)
  - Enable computationally limited searches (periodic sources)
  - Access to the LIGO deep archive
  - Access to observatories
- Science analysis environment for the LSC (LIGO Scientific Collaboration)
  - Tier 2 centers: shared LSC resource
  - Exploratory algorithm and astrophysics research with LIGO reduced data sets
  - Distributed I/O layer and processing layer builds on existing APIs
  - Data mining of LIGO (event) metadatabases
  - LIGO data browsing for LSC members, outreach
[Diagram: Hanford and Livingston observatories linked to Caltech (Tier 1) and MIT over Internet2/Abilene, with LSC Tier 2 centers; link speeds OC3, OC12, OC48]
LIGO I science run: 2002–; LIGO II upgrade: 2005–20xx (MRE proposal to NSF 10/2000)

Slide 20: GriPhyN Cost Breakdown (May 31)

Slide 21: Number of Tier2 Sites vs. Time (May 31)

Slide 22: LHC Tier2 Architecture and Cost
- Linux farm of 128 nodes (256 CPUs + disk): $350K
- Data server with RAID array: $150K
- Tape library: $50K
- Tape media and consumables: $40K
- LAN switch: $60K
- Collaborative tools & infrastructure: $50K
- Installation & infrastructure: $50K
- Net connect to WAN (Abilene): $300K
- Staff (ops and system support): $200K
- Total estimated cost (first year): $1,250K
- Average yearly cost including evolution, upgrade and operations: $750K
- 1.5–2 FTE support required per Tier 2
- Assumes 3-year hardware replacement
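The first-year total is the sum of the listed components (in $K):

    350 + 150 + 50 + 40 + 60 + 50 + 50 + 300 + 200 = 1250.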

Slide 23: Tier 2 Evolution (LHC Example)
- Linux farm: 5,000 SI95 → 50,000 SI95
- Disks on CPUs: 4 TB → 40 TB
- RAID array: 2 TB → 20 TB
- Tape library: 4 TB → … TB
- LAN speed: … Gbps → … Gbps
- WAN speed: … Mbps → … Gbps
- Collaborative infrastructure: MPEG2 VGA (… Mbps) → Realtime HDTV (… Mbps)
- RAID disk used for "higher availability" data
- Reflects lower Tier 2 component costs due to less demanding usage, e.g. simulation

Slide 24: Current Grid Developments
- EU "DataGrid" initiative
  - Approved by the EU in August (3 years, $9M)
  - Exploits GriPhyN and related (Globus, PPDG) R&D
  - Collaboration with GriPhyN (tools, boards, interoperability, some common infrastructure)
- Rapidly increasing interest in Grids
  - Nuclear physics
  - Advanced Photon Source (APS)
  - Earthquake simulations
  - Biology (genomics, proteomics, brain scans, medicine)
  - "Virtual" observatories (NVO, GVO, ...)
  - Simulations of epidemics (Global Virtual Population Lab)
- GriPhyN continues to seek funds to implement its vision

Slide 25: The Management Problem

Slide 26: PPDG BaBar Data Management
[Diagram: the data-management efforts of BaBar, D0, CDF, nuclear physics, CMS, and ATLAS, together with their user communities, built on middleware from the Globus team, Condor, the SRB team, and HENP GC, each of which also serves its own users]

Slide 27: Philosophical Issues
- Recognition that the IT industry is not standing still
  - Can we use tools that industry will develop?
  - Does industry care about what we are doing?
  - Many funding possibilities
- Time scale
  - 5 years is looooonng in internet time
  - Exponential growth is not uniform; relative costs will change
  - Cheap networking? (10 GigE standards)

Slide 28: Impact of New Technologies
- Network technologies
  - 10 Gigabit Ethernet (10GigE)
  - DWDM wavelengths (OC-192)
  - Optical switching networks
  - Wireless broadband (from ca. 2003)
- Internet information software technologies
  - Global information "broadcast" architecture, e.g. Multipoint Information Distribution Protocol
  - Programmable coordinated agent architectures, e.g. Mobile Agent Reactive Spaces (MARS), Cabri et al.
- Interactive monitoring and control of Grid resources
  - By authorized groups and individuals
  - By autonomous agents
- Use of shared resources
  - E.g., CPU, storage, bandwidth on demand

Slide 29: Grids in 2000
- Grids will change the way we do science and engineering
  - Computation, large-scale data
  - Distributed communities
- The present
  - Key services and concepts have been identified
  - Development has started
  - Transition of services and applications to production use
- The future
  - Sophisticated integrated services and toolsets (Inter- and IntraGrids+) could drive advances in many fields of science & engineering
- Major IT challenges remain
  - An opportunity & obligation for collaboration